EXECUTIVE SUMMARY


Small UAVs will be used with growing frequency in the near future for military
operations. As SUAVs progress from being novelties and toys to becoming full members
of the military arsenal, their reliability and availability must begin to approach the levels
expected of military systems. They currently miss those levels by a wide margin.

The military has wide experience with the need for reliability improvement in
systems, and in fact developed or funded the development of many of the methods
discussed  in  this  thesis.  These  methods  have  not  yet  been  applied  to  SUAVs. 

The  projection  of  reliability  experience  from  manned  piloted  aviation  to  UAVs 
has  led  to  overestimation  of  the  UAV  reliability.  Real  and  urgent  operational  demands  in 
the  Persian  Gulf,  Kosovo,  and  Afghanistan  have  highlighted  the  very  low  levels  of 
reliability  of  UAVs  compared  to  manned  air  vehicles. 

To  make  a  decision,  one  needs  analytical  support.  Analytical  support  requires 
models.  Models  require  good  data.  Good  data  requires  systems  to  collect  and  archive  it 
for  easy  retrieval.  When  I  began  this  thesis,  I  thought  that  good  data  on  SUAV  reliability 
would  be  easily  available  for  analysis.  I  was  mistaken.  That  is  why  the  majority  of  this 
thesis  has  discussed  data  collection  systems  and  argued  that  some  (but  not  all)  need  to  be 
applied  to  SUAVs.  For  ease  of  implementation,  we  adapted  forms  from  commercial  use 
for  FMECA  and  FRACAS  systems  for  SUAVs,  and  constructed  a  very  detailed  FTA  for 
a  typical  SUAV.  This  work  is  more  typical  of  a  reliability  engineering  thesis,  but  was 
necessary  to  enable  any  operational  analysis. 

With  the  existing  crude  data  on  one  UAV  system,  I  was  able  to  perform  a  crude 
analysis  using  a  reliability  growth  model  based  on  Duane’s  postulate.  With  good  data,  the 
Navy  will  be  able  to  do  much  more,  as  outlined  in  the  thesis. 
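As an illustration of the kind of analysis meant here, the following is a minimal sketch of a reliability growth fit under Duane's postulate, which assumes that cumulative MTBF grows as a power of cumulative operating time, MTBF_c(T) = b * T^alpha, so that the relationship is a straight line on log-log axes. The function names and the sample figures are hypothetical and are not taken from the thesis data; the fit is an ordinary least-squares line, which is one common way to estimate the growth rate alpha.

```python
import math


def fit_duane(cum_hours, cum_failures):
    """Fit log(cumulative MTBF) = log(b) + alpha * log(T) by least squares.

    cum_hours    -- cumulative operating hours at each observation
    cum_failures -- cumulative number of failures at the same points
    Returns (alpha, b): growth rate and scale of the Duane model.
    """
    xs = [math.log(t) for t in cum_hours]
    ys = [math.log(t / n) for t, n in zip(cum_hours, cum_failures)]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    b = math.exp(mean_y - alpha * mean_x)
    return alpha, b


def instantaneous_mtbf(t, alpha, b):
    """Instantaneous MTBF implied by the model: cumulative MTBF / (1 - alpha)."""
    return b * t ** alpha / (1.0 - alpha)


# Hypothetical flight-test log: cumulative hours and cumulative failures.
hours = [50.0, 120.0, 300.0, 700.0]
failures = [9, 16, 28, 45]
alpha, b = fit_duane(hours, failures)
print("growth rate alpha:", round(alpha, 3))
print("instantaneous MTBF at 700 h:", round(instantaneous_mtbf(700.0, alpha, b), 1))
```

A positive alpha indicates that failures are becoming less frequent as fixes are incorporated. With only a handful of crude data points the estimate is rough, which is the reason better data collection is argued for.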

The DoD Reliability Primer is currently under extensive revision. In the
meantime, this thesis can serve as a survey of the reliability methods that are applicable
to SUAVs and as a template for the implementation of FMECA, FTA, and FRACAS
methods for reliability improvement for SUAVs. As with all surveys, it has depended on
the work of the original authors, which I have borrowed liberally and documented
extensively. The adaptation of these methods for SUAVs is the original contribution of
this thesis.

I  observed  developmental  tests  of  SUAVs  in  the  course  of  writing  this  thesis.  I 
can  personally  attest  that  no  appropriate  methods  of  data  collection,  archival,  or  analysis 
are  currently  being  used,  and  that  these  methods  are  desperately  needed  by  the  SUAV 
community  if  it  is  to  progress  beyond  the  novelty  stage.  I  strongly  recommend  their 
adoption  by  NAVAIR. 

This  thesis  makes  an  initial  examination  of  the  real  problem  of  SUAV  reliability. 
Primarily  it  is  a  qualitative  approach,  which  illuminates  some  of  the  problem’s  aspects. 
Collecting real data from SUAV systems will populate reliability databases.
Quantitative  reliability  analysis  may  then  follow  and  result  in  detailed  information  about 
reliability  improvement,  but  only  if  the  collection  systems  outlined  here  are  implemented 
to  provide  the  data  for  analysis. 




I.  INTRODUCTION 


A.  BACKGROUND  (UAVS,  SUAVS) 

1.  UAV  -  Small  UAV 

One hundred years after the Wright brothers’ first successful airplane flight,
aircraft have been proven invaluable in combat. Unfortunately, airplanes have also
contributed to the loss of operator life. Many pilots have been killed attempting to
accomplish their mission, to become better pilots, and to test new technologies. The
development of uninhabited or unmanned aerial vehicles (UAVs) raises the possibility of
more efficient, secure, and cost effective military operations.1

The  UAV  puts  eyes  out  there  in  places  we  don’t  want  to  risk 
having  a  manned  vehicle  operate.  Sometimes  it’s  very  dull,  but  necessary 
work — flying  a  pattern  for  surveillance  or  reconnaissance.  UAVs  can  go 
into  a  dirty  environment  where  there’s  the  threat  of  exposure  to  nuclear, 
chemical  or  biological  warfare.  They  are  also  sent  into  dangerous 
environments - battle zones: Dull, Dirty, Dangerous. The primary reason
for  the  UAV  is  the  Three  D’s.2 

The history of UAVs started in 1883 when Douglas Archibald attached an
anemometer to the line of a kite. Archibald managed to obtain differential measures of
wind velocity at altitudes up to 1,200 feet. In 1888, Arthur Batut made the first aerial
photograph in France, after installing a camera on a kite. The first use of UAVs built for
military purposes was during WWII by the Germans. The well-known flying bombs V-1
and V-2 showed that unmanned aircraft could be launched against targets and create a
destructive effect. In the 1950s, the US developed the Snark. It was an unmanned
intercontinental range aircraft designed to supplement Strategic Air Command’s manned
bombers against the Soviet Union. Snark, V-1 and V-2 destroyed themselves as they hit
their targets. In fact, these were early versions of today’s cruise and ballistic missiles.3

1 Clade, Lt Col, USAF, “Unmanned Aerial Vehicles: Implications for Military Operations,” July 2000,
Occasional Paper No. 16, Center for Strategy and Technology, Air War College, Air University, Maxwell
Air Force Base.

2 Riebeling, Sandy, Redstone Rocket Article, Volume 51, No. 28, “Unmanned Aerial Vehicles,” July
17, 2002, Col. Burke John, Unmanned Aerial Vehicle Systems project manager, Internet, February 2004.
Available at: http://www.tuav.redstone.army.mil/rsa_article.htm

3 Carmichael, Bruce W., Col (Sel), and others, “Strikestar 2025,” Chapter 2, “Historical Development
and Employment,” August 1996, Department of Defense, Internet, February 2004. Available at:
http://www.au.af.mil/au/2025/volume3/chap13/v3c13-2.htm
In the US, the need to perform reconnaissance (RECCE) missions by UAVs came
after the realization that these missions are extremely dangerous and mentally fatiguing
for the pilot. The U2 Dragon Lady planes used to be the state-of-the-art platforms for
RECCE missions. They were slow, with a maximum speed of 0.6 Mach, and cruised at
an altitude between 70,000 and 90,000 feet.4 In May 1960 the Soviets captured a U2 plane.
The pilot, Gary Powers, confessed to the black bird program, created by President
Eisenhower to monitor the development of Soviet intercontinental ballistic missiles after
the launch of Sputnik-I. The U2 flights over Russia were suspended. Spy satellites filled
the gap. In 1962, another U2 was hit by a Soviet anti-air missile while on a RECCE
mission over Cuba. The pilot was killed in the crash. As a result of these incidents, the first
unmanned RECCE “drone”, the AQM-34 Lightning Bug, was made by the Ryan
Aeronautical Company in 1964. The term “drone” became slang among military
personnel for early unmanned vehicles. It alluded to the DH.82B Queen Bee,
which was a dummy target for anti-aircraft gunner training.5

The  Lightning  Bug  was  based  on  the  earlier  Fire  Bee.  It  operated  from  1964  until 
April  1975,  performing  a  total  of  3,435  flight  hours  in  RECCE  missions  that  were  too 
dangerous  for  manned  aircraft,  especially  during  the  Vietnam  War.  Some  of  its  most 
valuable  contributions  were  photographing  prisoner  camps  in  Hanoi  and  Cuba,  providing 
photographic  evidence  of  SA-2  missiles  in  North  Vietnam,  providing  low-altitude  battle 
assessment  after  B-52  raids,  and  acting  as  a  tactical  air  launched  decoy.6 

In 1962, Lockheed Martin began developing the D-21 supersonic RECCE drone,
the Tagboard. It was designed to be launched from either the back of a two-seat A-12,
which  was  under  development  at  the  same  time,  or  from  the  wing  of  a  B-52H.  The  drone 
could  fly  at  speeds  greater  than  3.3  Mach,  at  altitudes  above  90,000  feet  and  had  a  range 


4  The  Global  Aircraft  Organization,  US  Reconnaissance,  “U-2  Dragon  Lady,”  Internet,  February  2004. 
Available  at:  http://www.globalaircraft.org/planes/u-2_dragon_lady.pl 

5 Clark, Richard M., Lt Col, USAF, “Uninhabited Combat Aerial Vehicles, Airpower by the People,
For the People, But Not with the People,” CADRE Paper No. 8, Air University Press, Maxwell Air Force
Base, Alabama, August 2000, Internet, February 2004. Available at:
http://www.maxwell.af.mil/au/aupress/CADRE_Papers/PDF_Bin/clark.pdf

6  Ibid. 
of 3,000 miles. The project was canceled in 1971 together with the A-12 development
due to numerous failures, high cost of operations, and bad management.7

In addition to the RECCE role, Teledyne Ryan experimented with strike versions
of the BQM-34 drone, the Tomcat. They investigated the possibility of arming the
Lightning Bug with Maverick electro-optical-seeking missiles or Stubby Hobo
electro-optically guided bombs. Favorable results were demonstrated in early 1972 but the armed
drones were never used during the Vietnam War. Interest in UAVs was fading by the
end of the Vietnam War.8

In  the  1973  Yom  Kippur  War,  the  Israelis  used  UAVs  effectively  as  decoys  to 
draw  antiaircraft  fire  away  from  attacking  manned  aircraft.  In  1982,  UAVs  were  used  to 
obtain  the  exact  location  of  air  defenses  and  gather  electronic  intelligence  information  in 
Lebanon and Syria. The Israelis also used UAVs to monitor airfield activities, changing
strike plans accordingly.9

2. The Pioneer RQ-2 10

The  US  renewed  its  interest  in  UAVs  in  the  late  1980s  and  early  90s,  with  the 
start  of  the  Gulf  War.  Instead  of  developing  one  from  scratch,  the  US  acquired  and 
improved  the  Scout,  which  was  used  by  the  Israelis  in  1982  against  the  Syrians.  The 
outcome  was  the  Pioneer,  which  was  bought  by  the  Navy  to  provide  cheap  unmanned 
over  the  horizon  targeting  (OTHT),  RECCE,  and  battle  assessment.  The  Army  and 
Marines  bought  the  Pioneer  for  similar  roles  and  six  Pioneer  systems  were  deployed  to 
SW  Asia  for  Desert  Storm. 

Compared  to  the  Lightning  Bug,  the  Pioneer  is  slower,  larger,  and  lighter,  but 
cheaper.  The  average  cost  of  the  platform  was  only  $850K,  which  was  inexpensive 
relative to the cost of a manned RECCE aircraft.11 With its better sensor technology, the

7  Carmichael. 

8  Ibid. 

9  Ibid. 

10  The  material  of  this  section  is  taken  (in  some  places  verbatim)  from  GlobalSecurity.org,  “Pioneer 
Short  Range  (SR)  UAV,”  maintained  by  John  Pike,  last  modified:  November  20,  2002,  Internet,  May  2004. 
Available at: http://www.globalsecurity.org/intell/systems/pioneer.htm

11  National  Air  and  Space  Museum,  Smithsonian  Institution,  “Pioneer  RQ-2A,”  1998-2000,  revised 
9/14/01 Connor R. and Lee R. E., Internet, May 2004. Available at:
http://www.nasm.si.edu/research/aero/aircraft/pioneer.htm
Pioneer can deliver real-time battlefield assessment in video stream, a huge improvement
compared to the film processing required for the Lightning Bugs.

By  2000,  after  15  years  of  operations,  the  Pioneer  had  logged  more  than  20,000 
flight  hours.  Apart  from  Desert  Storm  it  was  used  in  Desert  Shield,  in  Bosnia,  Haiti, 
Somalia,  and  for  other  peacekeeping  missions.  The  Navy  used  the  Pioneer  to  monitor  the 
Kuwait and Iraqi coastline and to provide spotting services for every 16-inch round fired
by  its  battleships. 

Pioneer can give detailed information about a local position to a battalion
commander. Joint force commanders wanted to see a bigger, continuous picture of the
battlefield, but space-based and manned-airborne RECCE platforms could not satisfy
their demand for continuous situational awareness information. In response to that need
and in addition to tactical UAVs (TUAVs) like the Pioneer, the US began to develop a
family of endurance UAVs.

Three different platforms compose the endurance UAV family: Predator, Global
Hawk, and Dark Star.

a. The Predator RQ-1 12

Predator is a by-product of the CIA-developed Gnat 750, also known as
the Tier II or medium altitude endurance (MAE) UAV. It is manufactured by General
Atomics Aeronautical Systems and costs about $3.2M to $4.5M per platform.13 Its
endurance was designed to be greater than 40 hours with a cruising speed of 110 knots
and operational speed of 75 knots using a reciprocating engine with a 25,000-foot ceiling
and 450-pound payload. Predator can carry electro-optical (EO) and infrared (IR)
sensors. It also collects full-rate video imagery and transmits it in near real-time via
satellite, other UAVs, manned aircraft or line-of-sight (LOS) data link. More importantly,
Predator is highly programmable. It can go from autonomous flight to manual control by
a remote pilot.


12 The material for this section is taken (in some places verbatim) from: Carmichael.

13 Ciufo, Chris A., “UAVs: New Tools for the Military Toolbox,” [66] COTS Journal, June 2003,
Internet, May 2004. Available at: http://www.cotsjournalonline.com/2003/66
Except for Pioneer, Predator is the most tested and commonly used UAV. It was
first  deployed  to  Bosnia  in  1994,  next  in  the  Afghan  War  of  2001,  and  then  in  the  Iraqi 
war  of  2003. 

Used as a low altitude UAV, Predator can perform almost the same tasks as
Pioneer: surveillance, RECCE, combat assessment, force protection, and close air
support. It can also be equipped with two laser-guided Hellfire missiles for direct hits at
moving or stationary targets. During Operation Enduring Freedom in Afghanistan,
Predators were considered invaluable to the troops for scouting around the next bend of
the road or over the hill for hidden Taliban forces.

Used as a high altitude UAV, the Predator can perform surveillance over a wide
area for up to 30 to 45 hours. In Operation Iraqi Freedom, Predators were deployed near
Baghdad to attract hostile fire from the city’s anti-air defense systems. Once the locations
of these defense systems were revealed, manned airplanes eliminated the targets.

b. The Global Hawk RQ-4 14

A Tier II+ aircraft, Global Hawk is a conventional high-altitude endurance
(CHAE) UAV by Teledyne Ryan Aeronautical. A higher performance vehicle, it was
designed to fulfill a post-Desert Storm requirement for high resolution RECCE of a
40,000 square nautical mile area in 24 hours. It can fly for more than 40 hours and over
3,000 miles away from its launch and recovery base carrying a synthetic aperture radar
(SAR) and an EO/IR payload of 2,000 pounds at altitudes above 60,000 feet at a speed of
340 knots. The cost of a Global Hawk is about $57M per unit.15

c. The Dark Star RQ-3 16

The Tier III stealth or low observable high altitude endurance (LOHAE)
RQ-3 UAV was the Lockheed-Martin/Boeing Dark Star. Its primary purpose was to
image  well-protected,  high-value  targets.  Capable  of  operating  for  more  than  eight  hours 
at  altitudes  above  45,000  feet  and  a  distance  of  500  miles  from  its  launch  base,  it  was 
designed  to  meet  a  $10M  per  platform  unit  cost.  Its  first  flight  occurred  in  March  1996; 


14  The  material  for  this  section  is  taken  (in  some  places  verbatim)  from:  Carmichael. 

15  Ciufo. 

16 The material for this section is taken (in some places verbatim) from: Carmichael.
however, a second flight in April 1996 crashed due to incorrect aerodynamic modeling of
the vehicle flight-control laws. The project was cancelled in 1999.17

For  the  characterization  code  RQ-3  the  "R"  is  the  Department  of  Defense 
designation  for  reconnaissance;  "Q"  means  unmanned  aircraft  system.  The  "3"  refers  to  it 
being the third of a series of purpose-built unmanned reconnaissance aircraft systems.18
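The reading of a designation such as RQ-3 can be sketched in a few lines of code. This is only an illustration: the function name, the lookup tables, and the handling of a trailing variant letter are assumptions, and the tables cover only the two letters explained in the text.

```python
import re

# Only the letters explained in the text are included; anything else is
# reported as unknown rather than guessed.
MISSION_LETTERS = {"R": "reconnaissance"}
VEHICLE_LETTERS = {"Q": "unmanned aircraft system"}


def decode_designation(code):
    """Split a designation like 'RQ-3' into mission, vehicle type and series number."""
    match = re.fullmatch(r"([A-Z])([A-Z])-(\d+)([A-Z]?)", code.strip())
    if match is None:
        raise ValueError("unrecognized designation: " + repr(code))
    mission, vehicle, number, variant = match.groups()
    return {
        "mission": MISSION_LETTERS.get(mission, "unknown"),
        "vehicle_type": VEHICLE_LETTERS.get(vehicle, "unknown"),
        "series_number": int(number),
        "variant": variant,
    }


print(decode_designation("RQ-3"))
```

The same reading applies to the other designations in this chapter, for example RQ-1, RQ-4, RQ-5, RQ-7, and RQ-8.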

3. RQ-5 Hunter 19

Initially engaged to serve as the Army’s short range UAV system for division and
corps commanders at a cost of $1.2M per unit,20 the RQ-5 Hunter can carry a 200 lb load
for more than 11 hours. It uses an electro-optical infrared (EO/IR) sensor, and relays its
video images in real-time via a second airborne Hunter over a line-of-sight (LOS) data
link. It deployed to Kosovo in 1999 to support NATO operations. Production was
cancelled in 1999 but the remaining low-rate initial production (LRIP) platforms remain
in service for training and experimental purposes. Hunter is to be replaced by the Shadow
200 or RQ-7 tactical UAV (TUAV).

4. RQ-7 Shadow 200 21

The  Army  selected  the  RQ-7  Shadow  200  in  December  1999  as  the  close  range 
UAV  for  support  to  ground  maneuver  commanders.  It  can  be  launched  by  the  use  of  a 
catapult  rail  and  recovered  with  the  aid  of  arresting  gear,  and  remain  at  least  four  hours 
on  station  with  a  payload  of  60  lbs. 

5. RQ-8 Fire Scout22

The  RQ-8  Fire  Scout  is  a  vertical  take-off  and  landing  (VTOL)  tactical  UAV 
(VTUAV). It can remain on station for at least three hours at 110 knots with a payload of
200  lb.  Its  scouting  equipment  consists  of  an  EO/IR  sensor  with  an  integral  laser 


17 GlobalSecurity.org, “RQ-3 Dark Star Tier III Minus,” maintained by John Pike, last modified:
November 20, 2002, Internet, May 2004. Available at:
http://www.globalsecurity.org/intell/systems/darkstar.htm

18  Ibid. 

19  The  material  for  this  section  is  taken  (in  some  places  verbatim)  from:  Office  of  the  Secretary  of 
Defense  (OSD),  “Unmanned  Aerial  Vehicles  Roadmap  2000-2025,”  April  2001,  page  4. 

20  Ciufo. 

21  The  material  for  this  section  is  taken  (in  some  places  verbatim)  from:  OSD  2001,  page  5. 

22  The  material  for  this  section  is  taken  (in  some  places  verbatim)  from:  OSD  2001,  page  5. 
designator rangefinder. Data is relayed to its ground or ship control station in real time
over  a  LOS  data  link  and  a  UHF  backup  that  could  operate  from  all  air  capable  ships. 

6. Residual UAV Systems23

The  US  military  maintains  the  residual  of  several  UAV  programs  that  are  not 
current  programs  for  development  but  have  recently  deployed  with  operational  units  and 
trained  operators.  BQM-147,  Exdrones,  is  an  80-lb  delta  wing  communications  jammer 
and  was  deployed  during  the  Gulf  War.  From  1997  to  1998  some  of  them  were  rebuilt 
and  named  Dragon  Drone  and  deployed  with  Marine  Expeditionary  units.  Air  Force 
Special  Operations  Command  and  Army  Air  Maneuver  Battle  Lab  are  also  conducting 
experiments  with  Exdrones. 

Some  hand-launched,  battery  powered  FQM-151  Pointers  have  been  acquired  by 
the  Marines  and  the  Army  since  1989  and  were  employed  in  the  Gulf  War.  Pointers 
performed  as  test  platforms  for  various  miniaturized  sensors  and  have  performed 
demonstrations  with  the  Drug  Enforcement  Agency,  National  Guard  and  Special 
Operations Forces.

7.  Conceptual  Research  UAV  Systems24 

The  various  service  laboratories  have  developed  a  number  of  UAVs  to  research 
special  operational  needs  and  concepts.  The  Marine  Corps  Warfighting  Laboratory  is 
exploring three such concepts. The Dragon Warrior or Cypher II is intended to fly over
the shore in fixed-wing flight mode and then, after removing its wings, convert into a
hovering land platform designed for urban operations.

Marines have converted a K-Max helicopter to a UAV in order to explore the
Broad Area Unmanned Responsive Resupply Operations concept, which is for
ship-to-shore resupply by UAVs.

Battery-powered  Dragon  Eye  is  a  mini-UAV  (2.4  foot  wingspan  and  4  lbs) 
developed  as  the  Navy’s  version  for  the  Over-The-Hill  RECCE  Initiative  and  the 
Marines’  Interim  Small  Unit  Remote  Scouting  System  requirement.  The  Dragon  Eye  can 
be  carried  in  a  backpack,  and  hence  is  given  the  name  of  Backpack  UAV. 


23  Ibid,  page  6. 

24  Ibid,  pages  7-8. 
Sponsored by the Defense Threat Reduction Agency, the Counterproliferation
(CP) Advanced Concept Technology Demonstration (ACTD) envisions deploying several
mini-UAVs like the Finder from a larger Predator UAV to detect chemical agents and
relay the results back through Predator.

The  CP  ACTD  is  designed  to  address  the  growing  need  to  provide 
a military capability for “precision engagement” of weapons of mass
destruction  (WMD)  related  facilities.  In  order  to  accomplish  this  objective, 
the  CP  ACTD  will  develop,  integrate,  demonstrate  and  transition  to  the 
warfighters,  operationally  mature  technologies  that  potentially  address  the 
unique  requirements  to  enhance  the  joint  counterforce  mission  to  hold 
WMD-related  facilities  at  risk.  The  driving  CP  counterforce  requirements 
include  enhancing  the  ability  to  predict  and  to  control  collateral  effects 
and  to  provide  prompt  response  and  reliable  kill.25 

Besides  the  Dragon  Eye  and  Finder  mentioned  above,  the  Naval  Research 
Laboratory  (NRL)  has  built  and  flown  several  small  and  micro-UAVs.  Definition  for 
these airframes will follow. The Naval Air Warfare Center Aircraft Division
(NAWC/AD)  maintains  a  small  UAV  test  and  development  team  and  also  operates 
various  types  of  small  UAVs. 

8.  DARPA  UAV  Programs26 

The Defense Advanced Research Projects Agency (DARPA) is sponsoring five
major creative UAV programs:

a. The Air Force X-45 UCAV, which was awarded to Boeing in 1999. The
mission for the UCAV is Suppression of Enemy Air Defenses (SEAD). The platform
will cost one third as much as a Joint Strike Fighter (JSF) to acquire and one quarter as
much to operate and support (O&S). The X-45A, with a maximum speed of 1,000 km/h,
was designed to carry two 500 kg bombs and to use radar absorbing materials, and was first
flown in June 2002.

b.  The  UCAV-Navy  X-46/X-47  is  a  similar  program  for  the  equivalent 
Navy  version  of  a  UCAV  that  can  be  carrier-based.  Apart  from  SEAD  missions,  RECCE 
and  strike  will  be  among  the  platform’s  capabilities.  The  X-47A  Pegasus  by  Northrop 

25  Department  of  Defense,  Director  of  Operational  Test  &  Evaluation,  “Missile  Defense  and  Related 
Programs  FY  1997  Annual  Report,”  February  1998,  Internet,  February  2004.  Available  at: 
http://www.fas.org/spp/starwars/program/dote97/97cp.html

26  The  material  for  this  section  is  taken  (in  some  places  verbatim)  from:  OSD  2001,  pages  8-9. 
Grumman successfully flew in March 2003 using modified GPS coordinates for
navigation. 

c. The Advanced Air Vehicle (AAV) program includes two rotorcraft
projects:

(1) The Dragon Fly Canard Rotor Wing, which will demonstrate
vertical take-off-and-land (VTOL) capability and then transition to fixed wing flight for
cruise.

(2)  The  A160  Hummingbird,  which  uses  a  hingeless  rigid  rotor  to 
perform  high  endurance  flight  of  more  than  24  hours  at  a  high  altitude  of  more  than 
30,000  feet. 

d. DARPA is exploring various designs of micro-air vehicles (MAVs),
which are less than six inches in any dimension. The Lutronix Kolibri and the Microcraft
Ducted Fan rely on an enclosed rotor for vertical flight, while the Lockheed Martin
Sanders Microstar and the AeroVironment Black Widow and E-Wasp are fixed-wing
horizontal fliers.

9. Other Nations’ UAVs

In FY00 some 32 nations manufactured more than 150 models of UAVs, and 55
countries operated some 80 types of UAVs, primarily for RECCE missions.

Derivatives of the Israeli designs are the Crecerelle used by the French Army, the
Canadair CL-289 used by the German and French Armies and the British Phoenix. The
Russians use the VR-3 Reys and the Tu-300 and the Italians the Mirach 150.27

10.  NASA 

In  the  civilian  sector,  NASA  has  been  the  main  agency  concerned  with 
developing  medium  and  high-altitude  long  endurance  UAVs.  The  agency  has  been 
involved  with  two  main  programs  “Mission  to  Planet  Earth”  and  “Earth  Science 
Enterprise”  for  environmental  monitoring  of  the  effects  of  global  climatic  change.  During 
the  late  80s,  NASA  started  to  operate  high-altitude  manned  aircraft,  but  later  decided  to 
develop  a  UAV  for  high-altitude  operations.  NASA  constructed  the  propeller  driven 

27  Petrie,  G.,  Geo  Informatics,  Article  “Robotic  Aerial  Platforms  for  Remote  Sensing,”  Department  of 
Geography & Topographic Science, University of Glasgow, May 2001, Internet, February 2004. Available
at:  http://web.geog.gla.ac.uk/~gpetrie/12_17_petrie.pdf 
Perseus between 1991 and 1994 and Theseus, which was a larger version of Perseus, in
1996. 


In  1994  NASA  started  its  Environmental  Research  Aircraft  and  Sensor 
Technology (ERAST) program. As a result, NASA has operated the Altus and Altus II
since  1998.  Their  operating  ceilings  are  45,000  to  65,000  feet  using  turbocharged 
engines. 

The  development  of  solar  powered  UAVs  is  also  being  supported  and  funded  by 
NASA. The idea, development, and construction were initiated by the Aerovironment
company, which has been involved in the construction of solar-powered aircraft for 20
years. Solar Challenger, HALSOL, Talon, Pathfinder, Centurion, and Helios, with a
wingspan of 247 feet, were among the solar-powered UAVs produced during those efforts.28

New  technologies  like  regenerative  fuel-cell-powered  UAVs  are  underway.  These 
allow  UAVs  to  fly  for  weeks  or  months,  reducing  the  costs  of  missions  so  as  to  deliver  a 
maximum  return  on  investment  per  flight.  NASA  will  also  support  the  development  of 
such  technology.29 

11. What Is a UAV?30

The  distinction  between  cruise  missile  weapons  and  UAV  weapon  systems  is 
sometimes  confusing.  Their  main  differences  are: 

a.  UAVs  are  designed  to  be  recovered  at  the  end  of  their  flight  while  cruise 
missiles  are  not. 

b.  A  warhead  is  tailored  and  integrated  into  a  missile’s  airframe  while  any 
munitions  carried  by  UAVs  are  external  loads. 

According to the DoD Dictionary (Joint Publication 1-02), a UAV is

A powered, aerial vehicle that does not carry a human operator,
uses aerodynamic forces to provide vehicle lift, can fly autonomously or
be piloted remotely, can be expendable or recoverable, and can carry a

28  Ibid. 

29  UAV  Rolling  News,  “New  UAV  work  for  Dryden  in  2004,”  June  12,  2003,  Internet,  February  2004. 
Available at: http://www.uavworld.com/_disel/00000068.htm

30 The material for this section is taken (in some places verbatim) from: Office of the Secretary of
Defense (OSD), “Unmanned Aerial Vehicles Roadmap 2002-2027,” December 2002, Section 1,
“Introduction.”
lethal or non-lethal payload. Ballistic or semi ballistic vehicles, cruise
missiles, and artillery projectiles are not considered unmanned aerial
vehicles.31

12. Military UAV Categories

UAVs  can  be  classified  according  to  different  criteria  such  as  mission  type,  sensor 
type,  performance,  and  control  system.  Remote  Piloted  Vehicles  (RPVs)  and  autonomous 
UAVs  are  two  distinct  groups  based  on  their  different  control  systems.  They  have  many 
common  features  but  the  main  difference  is  that  an  RPV  follows  the  data-link  commands 
of  a  remote  station  for  the  specific  air  mission.  In  other  words,  it  is  a  “dumb”  vehicle, 
which  can  carry  sensors  and  relay  data.  UAVs  can  be  further  classified  according  to  their 
mission  as  Reconnaissance  Surveillance  and  Target  Acquisition  (RSTA)  UAVs,  Combat 
UAVs  (UCAVs),  and  others.  According  to  the  way  they  are  launched,  they  can  be 
classified  as  hand-launched,  rail-launched,  rocket-launched  and  airfield-launched. 

We  also  classify  military  UAVs  in  three  main  categories,  considering  their  ceiling 
as  their  driver  characteristic:  Tactical  UAVs  (TUAVs),  Medium-Altitude  Endurance 
UAVs (MAE UAVs), and High-Altitude Endurance UAVs (HAE UAVs).32

a. Tier I or TUAVs are inexpensive with an average cost of 100K$FY00,
with a limited payload of around 50 kg, an LOS-permitted range from the ground control
station,  and  endurance  of  approximately  four  hours.  In  general,  they  are  rather  small  with 
an  average  length  of  two  meters  and  their  maximum  ceiling  is  around  5,000  feet.  Pioneer 
is  a  typical  example.  This  category  is  also  referred  to  as  “Battlefield  UAVs”  and  can  be 
divided  in  three  subcategories: 

(1) Micro UAVs (MUAVs) are very small UAVs, 6 to 12
inches in size.33 The Aerovironment Wasp is an example of this category.34


31  Ibid. 

32 Tozer, Tim, and others, “UAVs and HAPs-Potential Convergence for Military Communications,”
University of York, DERA Defford, undated, Internet, February 2004. Available at:
http://www.elec.york.ac.uk/comms/papers/tozer00_ieecol.pdf

33  Pike,  John,  Intelligence  Resource  Program,  “Unmanned  Aerial  Vehicles  (UAVS),”  Internet,  March 
2004.  Available  at:  http://www.fas.org/irp/program/collect/uav 

34 The material for this part of section is taken (in some places verbatim) from: OSD 2002, Section 2,
“Current UAV programs.”
(2) Mini UAVs have a span up to four feet. They provide the
company/platoon/squad level with an organic RSTA capability out to 10 km. The
Aerovironment Dragon Eye is an example of this category.35

(3) Small UAVs (SUAVs) have a size greater than four feet in
length. “SUAV is a low-cost and user-friendly UAV system.” It is a highly mobile air
vehicle system that among other potentials allows the small warfighting unit to set the
foundation to exploit battlefield information superiority.36

b.  Tier  II  or  MAE  UAVs  are  larger  than  TUAVs,  more  expensive,  with  an 
average  cost  of  1M$FY00,  and  have  enhanced  performance.  Their  payload  can  reach  300 
kg,  their  endurance  is  12  or  more  hours,  and  their  ceiling  is  up  to  20,000  feet.  Predator  is 
a  typical  example  of  a  MAE  UAV. 

c. Tier II Plus or HAE UAVs can be large craft with an endurance of more
than 24 hours, payload capacities of more than 800 kg, and a ceiling of more than 30,000
feet. Their average cost is about 10M$FY00. Global Hawk is a typical example of an HAE
UAV.


d. Tier III Minus or Low Observable HAE (LOHAE) UAVs can be large craft with an endurance
of more than 12 hours, payload capacities of more than 300 kg, and a ceiling of more
than 65,000 feet. Dark Star was a typical example of a LOHAE UAV.

13.  Battlefield  UAVs 

Here  are  two  descriptions  of  the  use  of  UAVs  in  training  and  combat. 
a.  Story  1.  Training  at  Fort  Bragg 

“FDC this is FO adjust fire, over”. “FO this is FDC adjust fire,
out”. “FDC grid 304765, over”. “FO grid 304765, out”. “FDC two tanks
in the open, over”. “FO that’s two tanks in the open, out”. Then about 30
seconds later, “FO shot, over”. “FDC shot, out”. “FO splash, over”. “FDC
splash, out”. Fort Bragg, N.C. (April 5, 2001).

Communications  like  these  can  normally  be  heard  during  a  live-fire 
training  exercise  between  the  forward  observers  (FO)  and  the  Marines  at 
the  fire  direction  control  centre  (FDC),  but  during  exercise  Rolling 


35  Ibid. 

36 NAVAIR, “Small Unmanned Aerial Vehicles,” undated, Internet, February 2004. Available at:
http://uav.navair.navy.mil/smuav/smuav_home.htm 
Thunder, the 3rd Battalion, 14th Marines used a different type of forward
observer.

Instead of a few Marines dug in a forward position, a UAV
controlled by the Marines from the Marine Fixed-Wing Unmanned Vehicle
Squadron 2 (VMU-2), Cherry Point, N.C., gave the calls for fire.

The  UAV  is  a  remote-controlled,  single-propeller  plane  with  a 
wing  span  of  17  feet  and  an  overall  length  of  14  feet.  Inside  the  body  of 
the  plane  is  a  camera  that  allows  the  pilots  to  see  and  to  identify  targets, 
according  to  Cpl.  Tim  Humbert,  team  non-commissioned  officer,  VMU-2. 

“This was an excellent training opportunity for us,” said Capt.
Konstantine Zoganas, battalion fire direction officer, 3rd Bn., 14th Marines,
Philadelphia,  Pa.  “There  aren’t  many  units  who  get  the  opportunity  to  train 
with  this  equipment.” 

For  this  mission,  the  UAV,  which  was  flying  at  around  6,000  to 
8,000  feet,  was  used  to  identify  targets.  They  then  looked  at  that  data  and 
turned  it  into  a  fire  mission,  which  was  sent  to  the  Marines  on  the  gun  line. 
Once  the  Marines  on  the  gun  line  blasted  their  round  toward  the  target,  the 
UAV  was  used  to  adjust  fire.  “After  using  the  UAV,  I  think  it  is  equal  to, 
if  not  better  than,  a  forward  observer,”  said  Zoganas.  “A  forward  observer 
has  a  limited  view  depending  on  where  he  is  at,  but  a  UAV,  being  in  the 
air,  has  the  ability  to  cover  a  lot  more  area,”  said  Zoganas.  “I  think  the 
UAV’s capabilities are underestimated, it is a great weapon to have on the
modern battlefield.”37

b. Story 2. Desert Shield/Storm Anecdote

Surrenders  of  Iraqi  troops  to  an  unmanned  aerial  vehicle  actually 
happened.  All  of  the  UAV  units  at  various  times  had  individuals  or  groups 
attempt  to  signal  the  Pioneer,  possibly  to  indicate  a  willingness  to 
surrender.  However,  the  most  famous  incident  occurred  when  USS 
Missouri  (BB  63),  using  her  Pioneer  to  spot  16-inch  gunfire,  devastated 
the  defences  of  Faylaka  Island  off  the  coast  of  Kuwait  City.  Shortly 
thereafter, while still over the horizon and invisible to the defenders, the
USS  Wisconsin  (BB  64)  sent  her  Pioneer  over  the  island  at  low  altitude. 
When  the  UAV  came  over  the  island,  the  defenders  heard  the  obnoxious 
sound  of  the  two-stroke  engine  since  the  air  vehicle  was  intentionally 
flown  low  to  let  the  Iraqis  know  that  they  were  being  targeted. 
Recognizing  that  with  the  “vulture”  overhead,  there  would  soon  be  more 
of  those  2,000-pound  naval  gunfire  rounds  landing  on  their  positions  with 
the  same  accuracy,  the  Iraqis  made  the  right  choice  and,  using 


37 Zachany, Bathon A., Marine Forces Reserve, “Unmanned Aerial Vehicles Help 3/14 Call For and
Adjust Fire,” Story ID Number: 2001411104010, April 5, 2001, Internet, February 2004. Available at:
http://www.13meu.usmc.mil/marinelink/mcn2000.nsf/Open document
handkerchiefs, undershirts, and bed sheets, they signalled their desire to
surrender.  Imagine  the  consternation  of  the  Pioneer  aircrew  who  called  the 
commanding  officer  of  Wisconsin  and  asked  plaintively,  “Sir,  they  want  to 
surrender,  what  should  I  do  with  them?”38 

14. Battlefield Missions

Reconnaissance  is  a  “mission  undertaken  to  obtain,  by  visual  or  other  detection 
methods,  information  about  the  activities  and  resources  of  an  enemy;  or  to  secure  data 
concerning the meteorological, hydrographical, or geographical characteristics of a particular
area.” This task is about gathering general information about an enemy or an area.
Surveillance is the “specific and systematic observation of a particular area or target for a
short or extended period of time.”39

UAVs  have  been  used  for  the  above  missions  since  their  inception.  They  can  also 
be  used  for  target  acquisition,  target  designation  and  battle  damage  assessment  (BDA). 
Due  to  their  small  size,  they  can  operate  more  discreetly  than  their  manned  counterparts, 
allowing  target  acquisition  to  occur  with  less  chance  of  counter-detection.  “The 
surveillance  UAV  can  be  used  to  designate  the  target  for  a  precision  air  and/or  artillery  or 
missile  strike  while  providing  near  real-time  battle  damage  assessment  to  the  force  or 
mission commander.”40 In that way, useless repeat attacks on a target and wasted
munitions can be avoided.

Battlefield UAVs are appropriate for all of the above missions. At the
beginning of the 1950s, UAVs like the Northrop Falconer were developed for
battlefield reconnaissance but saw little or no combat service. Later, in the early 1980s,
the Israelis became the early developers of the operational use of battlefield UAVs during
operations in southern Lebanon. Their successes with battlefield UAVs drew international
attention.41

38 The Warfighter’s Encyclopedia, Aircraft, UAVs, “RQ-2 Pioneer,” August 14, 2003, Internet,
February 2004. Available at: http://www.wrc.chinalake.navy.mil/warfighter_enc/aircraft/UAVs/pioneer.htm

39 Ashworth, Peter, LCDR, Royal Australian Navy, Sea Power Centre, Working Paper No. 6, “UAVs
and the Future Navy,” May 2001, Internet, February 2004. Available at:
http://www.navy.gov.au/spc/workingpapers/Working%20Paper%206.pdf

40 The material for the above part of section is taken (in some places verbatim) from: Ashworth.

41 Goebel, Greg, In the Public Domain, “[6.0] US Battlefield UAVs (1),” Jan 1, 2003, Internet,
February 2004. Available at: http://www.vectorsite.net/twuav6.html
We can distinguish two broad categories of battlefield UAVs: the “combat
surveillance” UAV and the “tactical reconnaissance” UAV.

a. Combat Surveillance UAVs42

The  function  of  combat  surveillance  UAVs  is  to  observe  everything  on  a 
battlefield  in  real-time,  flying  over  the  battle  area,  and  relaying  intelligence  to  a  ground- 
control  station.  In  general,  they  are  powered  by  a  small  internal  combustion  two-stroke 
piston  engine,  known  as  a  “chain  saw”  because  of  its  characteristic  noise.  An  autopilot 
system  with  a  radio  control  (RC)  back-up  for  manual  operations  directs  the  platform  from 
pre-takeoff  programmed  sets  of  waypoints.  In  most  cases,  the  program  is  set  up  by 
displaying  a  map  on  a  workstation,  entering  the  coordinates,  and  downloading  the 
program  into  the  UAV.  Navigation  is  always  verified  by  a  GPS  and  often  by  an  INS 
system  as  well. 

Combat  surveillance  UAVs  normally  use  the  autopilot  to  get  on  station 
(above  the  operating  area)  and  then  operate  in  manual  mode  by  RC  to  find  or  detect 
potential  targets.  As  a  result,  only  LOS  ranges  are  permitted,  due  to  the  limitations  of  the 
RC  transmitter  signals. 

Sensors  are  generally  housed  in  a  turret  underneath  the  platform  and/or  are 
integrated into the platform’s fuselage. They usually feature day-night imagers and, in
many cases, a laser designator, SIGINT packages, or Synthetic Aperture Radar (SAR).

Larger  UAVs  have  fixed  landing  gear  that  are  used  for  takeoff  and  landing 
purposes on small airstrips. Larger UAVs can also be launched by special rail launcher
boosters  and  recovered  by  parachute,  parasail  or  by  flying  into  a  net.  Smaller  UAVs  may 
be  launched  by  a  catapult  and  recovered  in  the  same  way  or  by  landing  in  plain  terrain 
without  any  use  of  landing  gear. 

b. Tactical Reconnaissance UAVs43

Tactical  Reconnaissance  (TR)  UAVs  are  usually  larger  and  in  some  cases 
jet  powered  with  extended  range  and  speed.  Like  the  combat  surveillance  UAVs,  they  are 
equipped  with  an  autopilot  with  RC  backup.  Their  primary  mission  is  to  fly  over 

42 The material for this section is taken (in some places verbatim) from: Goebel.

43  Ibid. 
predefined targets out of line of sight, and take pictures or relay near real-time data to the
ground-control  station  via  satellite  links. 


A  UAV  of  this  type  can  usually  carry  day-night  cameras  and/or  Synthetic 
Aperture  Radar  (SAR).  The  necessary  communication  equipment  is  usually  located  on 
the  upper  part  of  the  platform’s  fuselage.  A  TR  UAV  can  also  be  launched  from  runways 
or  small  airstrips,  an  aircraft,  and/or  by  special  rail  launcher  boosters,  and  be  recovered 
by  parachute. 

The  exact  distinction  between  the  two  types  of  battlefield  UAVs  and  other 
types  of  UAVs  is  not  clear.  Some  types  are  capable  of  both  missions.  A  small  combat 
surveillance  UAV  may  be  the  size  of  “a  large  hobbyist  RC  model  plane.”  It  can  be  “used 
to  support  military  forces  at  the  brigade  or  battalion  level  and  sometimes  they  are  called 
‘mini  UAVs.’  Their  low  cost  makes  them  suitable  for  ‘expendable’  missions.” 


B. PROBLEM DEFINITION
1. UAV Mishaps

According to the Office of the Secretary of Defense “UAV Roadmap,” the mishap rate
for  UAVs  is  difficult  to  define: 

Class  A  mishap  rate  (MR)  is  the  number  of  significant  vehicle 
damages  or  total  losses  occurring  per  100,000  hours  of  fleet  flight  time.  As 
no  single  U.S.  UAV  fleet  has  accumulated  this  amount  of  flying  time, 
each  fleet’s  MR  represents  its  extrapolated  losses  to  the  100,000-hour 
mark.  It  is  expressed  as  mishaps  per  100,000  hours.  It  is  important  to  note 
that  this  extrapolation  does  not  reflect  improvements  that  should  result 
from  operational  learning  or  improvement  in  component  technology.44 
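The extrapolation described in this definition can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the example fleet figures are hypothetical and are not taken from the OSD roadmap.

```python
def mishap_rate(mishaps: int, flight_hours: float, per_hours: float = 100_000.0) -> float:
    """Extrapolate a count of Class A mishaps to a common base of flight hours."""
    if flight_hours <= 0:
        raise ValueError("flight_hours must be positive")
    return mishaps / flight_hours * per_hours


# Hypothetical fleet: 3 Class A mishaps observed in 2,000 fleet flight hours.
example_rate = mishap_rate(3, 2_000)
print(f"{example_rate:.1f} mishaps per 100,000 hours")
```

As the quotation notes, a rate extrapolated from a short flight history does not reflect later improvement, so a figure of this kind is best read as a snapshot rather than a forecast.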


A  Pentagon  report  said  that  crashes  and  component  failures  are 
increasing  the  cost  of  UAVs  and  restrict  their  availability  for  military 
operations.45 


44  OSD  2002,  Appendix  J,  page  186. 

45  Peck,  Michael,  National  Defense  Magazine,  May  2003,  Feature  Article,  “Pentagon  Unhappy  About 
Drone  Aircraft  Reliability,  Rising  Mishap  Rates  of  Unmanned  Vehicles  Attributed  to  Rushed 
Deployments,” Internet, February 2004. Available at: http://www.nationaldefensemagazine.org/article.cfm?Id=1105
The reliability issue has sparked controversy and concern that UAVs are
becoming too expensive. There is a widespread notion that UAVs are simply
cheap, expendable vehicles, something like diapers that are used once and discarded.
The truth is that these are costly components of expensive systems.

To get a view of the problem, consider that the 2002 crash rate for Predator was
32.8 crashes per 100,000 flight hours, and for 2003 it was 49.6 through May. The accident
rate for the Global Hawk was 167.7 per 100,000 flight hours as of May 2003.46

Nevertheless, commanders can take greater risks with UAVs without worrying
about loss of life. These risks would not be taken with manned aircraft. For comparison, the
recently updated MR for the F-16 was 3.5 per 100,000 flight hours. According to DoD
data, the MR for the RQ-2A Pioneer was 363, while the MR for the RQ-2B dropped to
139. For the RQ-5 Hunter it was 255 for pre-1996 platforms, and has dropped to 16 since
then. For the Predator RQ-1A, it was 43, and for the RQ-1B it was 31.47

2.  What  is  the  Problem? 

Currently  a  network  experiment  series  named  Surveillance  and  Tactical 
Acquisition  Network  (STAN)  is  being  conducted  by  the  Naval  Postgraduate  School 
(NPS)  at  Camp  Roberts,  with  SUAVs  as  the  sensor  platforms  and  the  primary  source  of 
information.  SUAV  programs  are  currently  of  great  interest  to  the  Fleet,  Special  Forces, 
and  other  interested  parties  and  are  receiving  large  amounts  of  funding.  There  is  a  great 
deal  of  concern  about  the  reliability  of  SUAVs  because  a  lot  of  problems  have  emerged 
in  testing.  Reliability  must  be  improved. 

This  thesis  documents  these  problems.  At  the  CIRPAS  site  at  McMillan  Field  in 
Camp  Roberts  on  September  11  and  12,  2003,  I  observed  flight,  communication,  search 
and  detection,  and  target  acquisition  tests,  using  two  different  types  of  SUAV  platforms, 
XPV-1B TERN and Silverfox, an experimental program funded by the Office of Naval
Research. Incidents regarding reliability that occurred during that time include:


46  Peck,  Michael,  National  Defense  Magazine,  May  2003,  Feature  Article,  “Pentagon  Unhappy  About 
Drone  Aircraft  Reliability,  Rising  Mishap  Rates  of  Unmanned  Vehicles  Attributed  to  Rushed 
Deployments,” page 1, Internet, February 2004. Available at: http://www.nationaldefensemagazine.org/article.cfm?Id=1105

47  Peck. 
a. During the pre-takeoff checks at the runway end, an engine air-intake
filter failed (due to a broken support lock wire hole). The problem was obviously due to
engine vibrations. There was no spare filter or any other means to repair the failure,
so it was replaced with another TERN platform’s air filter.

Result:  the  mission  was  delayed  for  thirty  minutes. 

b. During the engine start procedure, a starting device failed. The failure
was due to a loose bolt, and the starting device could not start the engine. After a
ten-minute delay, the bolt was tightened.

Result:  the  procedure  was  delayed  for  ten  minutes. 

c. After two and a half hours of flight operation, the engine of a TERN platform
stalled in flight at 500 feet. The SUAV had run out of fuel.

Result: loss of one TERN platform.

d.  At  the  pre-takeoff  checks  on  a  Silverfox  platform,  recalibration  of  an 
engine’s rpm was necessary (probably because it was the initial flight after
replacing the old engines with new ones).

Result:  five-minute  delay. 

e.  During  the  operations  on  Silverfox  platforms,  many  bad  sensor  signals 
were  received  (especially  using  the  CCD  camera)  probably  due  to  ground-control  station 
antennas  or  due  to  LOS  constraints. 

Result: missions lost their search and detection capability.

f.  After  Silverfox’s  landings  (calculated  crashes)  in  the  field  (not  on  a 
runway),  extensive  cleaning  of  the  interior  of  the  platform  due  to  weeds,  soil  and  debris 
that  entered  the  vehicle  from  the  front  engine  opening  was  needed. 

Result:  At  least  twenty  minutes  cleaning  was  needed  after  such  landings. 

The next step for the STAN experiment was at the CIRPAS site at McMillan Field in
Camp Roberts from May 2 to May 6, 2004. I observed flight, communication, search and
detection, and networking tests, using the XPV-1B TERN on May 2 and 3. Incidents
regarding reliability that occurred during that time include:
a. During the assembly checks in the hangar on May 2, a major software
problem was detected. The team members were unable to repair it.

Result:  the  platform  was  unable  to  operate  at  all. 

b. During the test flight operation of the next platform the same day, the
engine stalled at 1,000 feet, which led to a platform crash.

Result:  loss  of  platform. 

c. On May 3, after one hour of flight operation and while the third platform was
in flight, an autopilot software malfunction occurred that led to an automatic
hard landing of the platform on the ground.48

Result: loss of one more TERN platform.

d. On May 5, during landing of the next TERN platform after two hours of flight
operation, the front tire delaminated.49 The damage, probably due to operator error,
could not be repaired by the team members.

Result:  loss  of  platform. 

e. On May 6, after one hour of flight operation and while in flight, a right-wing
servo failure occurred that resulted in loss of platform control and then in a platform
crash.50

Result:  loss  of  platform. 
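The incidents listed above from the two test periods can be captured in a simple structured log, which is the kind of record a failure-tracking system would need. The sketch below is illustrative only: the record layout, the field names, and the outcome labels ("delay", "loss", "degraded", "grounded") are assumptions introduced here, while the descriptions and delay times are transcribed from the incident lists.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    series: str         # test period, e.g. "2003-09" or "2004-05"
    item: str           # list letter used in the text
    description: str
    outcome: str        # hypothetical label: delay, loss, degraded, grounded
    delay_min: int = 0  # reported delay in minutes, 0 if none was reported


INCIDENTS = [
    Incident("2003-09", "a", "engine air-intake filter failed", "delay", 30),
    Incident("2003-09", "b", "starting device failed (loose bolt)", "delay", 10),
    Incident("2003-09", "c", "engine stalled at 500 feet, out of fuel", "loss"),
    Incident("2003-09", "d", "engine rpm recalibration", "delay", 5),
    Incident("2003-09", "e", "bad sensor signals", "degraded"),
    Incident("2003-09", "f", "interior cleaning after field landing", "delay", 20),
    Incident("2004-05", "a", "major software problem", "grounded"),
    Incident("2004-05", "b", "engine stalled at 1,000 feet", "loss"),
    Incident("2004-05", "c", "autopilot software malfunction", "loss"),
    Incident("2004-05", "d", "front tire delaminated", "loss"),
    Incident("2004-05", "e", "right-wing servo failure", "loss"),
]


def summarize(incidents):
    """Count incidents by outcome and total the reported delay minutes."""
    outcomes = Counter(i.outcome for i in incidents)
    total_delay = sum(i.delay_min for i in incidents)
    return outcomes, total_delay


outcomes, total_delay = summarize(INCIDENTS)
print(dict(outcomes), total_delay)
```

Even this crude tally makes the pattern visible: five of the eleven recorded incidents ended in the loss of a platform.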

3. What is the Importance of the Problem?

It  is  most  notable  that  SUAVs  are  not  technologically  sophisticated  enough  to 
warn  the  operator  that  the  vehicle  is  under  attack  and/or  under  critical  failure  (such  as  out 
of  fuel),  cannot  operate  under  unfavorable  weather  conditions,  and  have  a  low  level  of 
reliability,  which  degrades  their  role  in  military  operations.  Even  though  SUAVs  cost 
very  little  compared  to  other  systems,  such  as  observers,  helicopters,  planes  and  satellites, 
it  is  essential  that  small  UAV  missions  be  carried  out  with  an  acceptable  level  of 

48 Gottfried, Russell, LCDR (USN), Unmanned Vehicle Integration TACMEMO, 5-6 May Recap,
e-mail May 7, 2004.

49  Ibid. 

50  Ibid. 
reliability, operability, and reusability. In that way, they can become dependable systems
and  be  used  in  the  battlefield  with  other  systems. 

4. How Will the Development Teams Solve the Problem without the
Thesis? 

Trial and error and/or test, analyze and fix (TAAF) are the methods being used to
overcome failures for the Silverfox system. Because the system is in the experimental
phase, this is the easiest but most time-consuming way.

For the other system (TERN), which has been operational for almost two years, an
extended trial period is presently being conducted. From it, conclusions can be drawn for
future system improvements and operational use. Other experimental systems can also
contribute to quantitative assessments of readiness and availability.

5.  How  Will  This  Thesis  Help? 

This thesis provides a tool to consider reliability issues by developing a system
for tracking data that could improve reliability for SUAV systems.

6. How Will We Know That We Have Succeeded?

Verification  and  validation  of  the  proposed  solutions  and  methods  by  NAVAIR 
and  the  other  interested  parties  will  indicate  the  accuracy  and  the  effectiveness  of  the 
framework  suggested  by  this  thesis. 

7.  Improving  Reliability 

UAV reliability is the main issue preventing the FAA from relaxing its
restrictions on UAVs flying in civilian airspace and preventing foreign governments from
allowing overflights and landings. Improved reliability, or simply knowing actual mishap
rates and causes, will enable risk mitigation and eventual flight clearance.

Efforts  toward  improving  UAV  reliability  are  required,  but  how  can  this  best  be 
accomplished? The answer is by spending money, but we can be more specific. More
redundancy of flight control systems may increase reliability, but there is another
trade-off. The absence of components needed for manned aircraft makes UAVs cheaper, but
this  also  degrades  their  reliability.  If  reliability  is  sacrificed,  then  high  attrition  will 
increase  the  number  of  UAVs  needed  and  so  the  cost  will  rise  again. 
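The attrition argument above can be made concrete with a rough cost comparison. The sketch below is a simplification under stated assumptions: every mishap is treated as a total loss that is replaced at full unit cost, and all unit costs, fleet sizes, mishap rates, and flight hours are hypothetical numbers chosen only for illustration.

```python
def expected_fleet_cost(unit_cost: float, initial_fleet: int,
                        mishap_rate: float, flight_hours: float) -> float:
    """Initial purchase plus expected replacements.

    mishap_rate is expressed per 100,000 flight hours; each mishap is
    assumed (for this sketch) to destroy one vehicle, which is then replaced.
    """
    expected_losses = mishap_rate * flight_hours / 100_000
    return unit_cost * (initial_fleet + expected_losses)


# Hypothetical comparison over 20,000 planned flight hours with 10 vehicles.
cheap = expected_fleet_cost(unit_cost=100_000, initial_fleet=10,
                            mishap_rate=300, flight_hours=20_000)
robust = expected_fleet_cost(unit_cost=150_000, initial_fleet=10,
                             mishap_rate=50, flight_hours=20_000)
print(f"cheap, unreliable design:  ${cheap:,.0f}")
print(f"costlier, reliable design: ${robust:,.0f}")
```

With these assumed figures the cheaper vehicle ends up costing more over the program because of replacements, which is the effect the paragraph describes. Different inputs can reverse the ordering.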
By focusing on flight control systems, propulsion, and operator training, which
account for approximately 80% of UAV mishaps, we can increase reliability.51
Redundancy in on-board systems is not easily added, especially to small UAVs. Weight
and volume restrictions are very tight and that can lead to expensive solutions. But then if
we make UAVs too expensive, we cannot afford to lose them.

We can categorize UAVs by their volume, by their usage, by their endurance or
by their capabilities and type of operations, but we can also view each UAV system as a
unique case. We can analyze the system according to its functional components and do a
Failure Mode and Effect Analysis (FMEA). That is the first step for further
implementation of a reliability tracking and improvement method such as FRACAS,
Failure Mode Effect and Criticality Analysis (FMECA) or even an implementation of
MSG-3, if it is more suitable. I discuss these methods in detail later.

Reliability by itself is a measure of effectiveness (MOE). In order to keep track of
reliability I develop some measures of performance (MOP), and by using them we can
determine the results of our reliability corrective actions, if any. We can also keep track
of our system’s ability to be maintained, and if we consider the operational requirements
and logistic data, then we can evaluate its availability as well. Definitions and a
discussion of reliability are included in Appendix D.

8. Area of Research

This study provides a basis for conducting reliability tracking for SUAVs to
improve techniques and methodologies that increase SUAV readiness. To achieve this,
existing methodologies of controlling reliability, FMEA and reliability centered
maintenance (RCM) with maintenance steering group-3 (MSG-3), are analyzed and
compared. Finally, a criticality analysis provides a method for SUAV operators to
account for and to mitigate risk during operations.


51  Peck 
II. RELATED RESEARCH


A.  EXISTING  METHODS 

The following section presents and analyzes existing general methods of failure
tracking  and  analysis,  as  well  as  the  existing  reliability  centered  maintenance  method  that 
has  been  used  by  the  civil  aviation  industry.  A  comparison  between  them,  focusing  on 
small  UAV  (SUAV)  application,  is  also  presented. 

1. General: FMEA, FMECA and FTA

a.  Introduction  to  Failure  Mode  and  Effect  Analysis  (FMEA) 
Well-managed  companies  are  interested  in  preventing  or  at  least 
minimizing  risk  in  their  operations,  through  risk  management  analysis.  “The  risk  analysis 
has  a  fundamental  purpose  of  answering  the  following  two  questions: 

•  What  can  go  wrong? 

•  If  something  does  go  wrong,  what  is  the  probability  of  it  happening  and 
what  are  the  consequences?”52 

Previously, forensic techniques were used to answer these questions.
Today  the  focus  has  changed.  “The  focus  is  on  prevention.”53 

FMEA  is  one  of  the  first  systematic  techniques  for  failure  analysis.  “An 
FMEA is often the first step of a system’s reliability study.”54 It incorporates reviewing
components,  assemblies  and  subsystems  to  identify  failure  modes,  causes  and  effects  of 
such  failures.  FMEA  is  a  systematic  method  of  identifying  and  preventing  product  and 
process  failures  before  they  occur.  It  is  focused  on  preventing  defects,  enhancing  safety, 
and  increasing  customer  satisfaction. 


52  Stamatis,  D.  H.,  Failure  Mode  and  Effect  Analysis:  FMEA  from  Theory  to  Execution,  American 
Society  for  Quality  (ASQ),  1995,  page  xx.  The  above  part  of  section  is  a  summary  and  paraphrase  (in  some 
places  verbatim)  of  “Introduction.” 

53  Ibid,  page  xxi. 

54 Hoyland, A., and Rausand, M., System Reliability Theory: Models and Statistical Methods, New
York:  John  Wiley  and  Sons,  1994,  page  73. 
The purpose of FMEA is preventing process and product problems before
they occur.55 Used in the design and manufacturing process, FMEAs reduce cost and
effort by identifying product and process improvements early in the development phase
when it is easier, faster and less costly to make changes. Formal FMEAs were first
conducted in the aerospace industry in the mid-1960s, when looking at safety issues.
Industry  in  general  (automotive  particularly)  adapted  the  FMEA  for  use  as  a  quality 
improvement  tool. 

“FMEA  is  a  specific  methodology  to  evaluate  a  system,  design,  process  or 
service,  for  possible  ways  in  which  failures  (problems,  errors,  risks,  and  concerns)  can 
occur.”56  For  each  of  the  failures  identified,  an  estimate  is  made  for  its  occurrence, 
severity,  and  detection.  Then  an  evaluation  is  made  for  the  necessary  action  to  be  taken, 
planned,  or  ignored.  The  effort  focuses  on  minimizing  the  probability  of  failure  or  the 
effect  of  failure.  This  approach  can  be  technical  or  nontechnical.  Technical  is  the 
quantitative  way,  in  other  words,  the  way  in  which  we  determine,  express,  and  measure 
the  quantity  of  something.  Nontechnical  is  the  qualitative  way,  which  is  relative  to,  or 
involves  the  quality  of  something.  For  both  ways,  the  focus  is  on  the  risk  one  is  willing  to 
take.  In  that  way,  FMEA  becomes  a  systematic  technique  using  engineering  knowledge, 
reliability, and organizational development techniques.57
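The three estimates named above (occurrence, severity, and detection) are often combined into a single ranking figure. The sketch below uses the common Risk Priority Number convention, RPN = severity × occurrence × detection on 1 to 10 scales; that convention is an assumption brought in for illustration and is not prescribed in this section. The failure modes echo incidents described earlier, but every rating shown is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (remote) to 10 (almost certain)
    detection: int   # 1 (almost certain to be detected) to 10 (undetectable)

    def __post_init__(self):
        # Reject ratings outside the assumed 1-10 scale.
        for field_name in ("severity", "occurrence", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk Priority Number: higher values call for earlier action."""
        return self.severity * self.occurrence * self.detection


# Hypothetical ratings, for illustration only.
modes = [
    FailureMode("fuel exhaustion in flight", severity=9, occurrence=4, detection=7),
    FailureMode("loose starter bolt", severity=3, occurrence=5, detection=2),
    FailureMode("air-intake filter support failure", severity=5, occurrence=3, detection=4),
]

ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.rpn:4d}  {mode.name}")
```

The ranking only orders the failure modes for attention. Whether action is taken, planned, or deferred still depends on the level of risk one is willing to accept.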
b.  Discussion 

FMEA,  as  a  qualitative  analysis,  is  better  carried  out  during  the  design 
stages  of  the  system.  “The  purpose  is  to  identify  design  areas  where  improvements  are 
needed to meet reliability requirements.”58 It provides an important basis for design
reviews  and  inspections.  It  can  be  carried  out  using  the  bottom-up  or  the  top-down 
approach.  With  the  bottom-up  approach  or  hardware  approach,  FMEA  starts  at  the 
component  level  and  expands  upward.  When  the  expansion  is  from  the  system  level 
downwards, then the top-down or functional approach is being used. Most FMEAs are

55 McDermott, E. R., Mikulak, J. R., and Beauregard, R. M., The Basics of FMEA, Productivity Inc.,
1996,  page  4. 

56  Stamatis,  page  xxi. 

57  Stamatis,  page  xxii.  The  above  part  of  section  is  a  summary  and  paraphrase  (in  some  places 
verbatim)  of  “Introduction.” 

58  Hoyland,  page  74. 
carried out according to the bottom-up approach. However, for some systems adopting
the top-down approach can save time and effort.59

In order to have a formal FMEA process, accurate data is key. Given
accurate data, one can make the proper assumptions and calculations, producing an
accurate FMEA process. Accurate data presume a comprehensive quality system
implementation. Without accurate data “on a product or process, the FMEA becomes a
guessing game, based on opinions rather than actual facts.” Implementing a quality
system assures standard procedures and proper documentation and thus yields reliable
data.60

“The basic questions to be answered by FMEA are

(1) How can each part of the system possibly fail?

(2) What mechanisms might produce these modes of failure?

(3) What could the effects be if the failures did occur?

(4) Is the failure in the safe or unsafe direction?

(5) How is the failure detected?

(6) What inherent provisions are provided in the design to compensate for
the failures?”61

There are at least four prerequisites that we must understand and consider
while conducting an FMEA:

(1) Not all problems are the same or equally important.

(2) Know the customer (end user).

(3) Identify the function’s purpose and objective.

(4) An FMEA must be prevention oriented.62

59  Hoyland,  page  76.  The  above  part  of  section  is  a  summary  of  “Bottom-up  versus  Top-down 
Approach.” 

60  McDermott,  page  4.  The  above  part  of  section  is  a  summary  of  “Part  of  a  Comprehensive  Quality 
System.” 

61  Hoyland,  page  76. 

62  Stamatis,  page  xxii-xxiii. 
Definitions of terms related to failure and failure modes are presented in
Appendix C.

c.  FMEA:  General  Overview 

For a system, an FMEA “is an engineering technique used to define,
identify and eliminate known and/or potential failures” before they reach the end user.63
An FMEA may take two courses of action. First, historical data may be used to
analyze similar products or systems. Second, inferential statistics,
mathematical modeling, simulations, and reliability analysis may be used concurrently to
identify and define the failures. An FMEA, if conducted properly and appropriately, will
provide the practitioner with useful information that can reduce the risk load in the
system. It is one of the most important early preventive actions in a system, which can
prevent failures from occurring and reaching the user. “FMEA is a systematic way of
examining all the possible ways in which a failure may occur. For each failure, an
estimate is made of its effect on the system, of its seriousness, of its occurrence, and its
detection.” As a result, corrective actions required to prevent failures from reaching the
end user will be identified, thereby assuring the highest durability, quality and reliability
possible in the system.64

d. When is the FMEA Started?65

As a methodology used to maximize the end user’s satisfaction by
eliminating and/or reducing known or potential problems, FMEA must begin as early as
possible, even if all the facts and information are not yet known. After FMEA begins, it
becomes a living document and is never really complete. It uses information to improve
the system and it is continually updated as necessary. Therefore, an FMEA should be
available for the entire system life.


63  Stamatis,  page  25. 

64  Stamatis,  page  26.  The  above  part  of  section  is  a  summary  and  paraphrase  (in  some  places  verbatim) 
of  “FMEA:  A  General  Overview.” 

65  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  Stamatis,  page  29,  “When  is 
the  FMEA  Started?” 
e. Explanation of the FMEA66

Identifying known and potential problems and preventing them from
reaching the end user is the essence of an FMEA system. One of the assumptions that
must be made is that problems have different priorities. Setting priorities is
important because it is the main issue that drives the methodology. Three
components help define the priority of failures: occurrence, severity and detection.

Occurrence  is  the  frequency  of  the  failure.  Severity  is  the 
seriousness  (effects)  of  the  failure.  Detection  is  the  ability  to  detect  the 
failure  before  it  reaches  the  customer.  To  define  the  value  of  these 
components,  the  usual  way  is  to  use  numerical  scales  called  risk-criteria 
guidelines. These guidelines can be qualitative and/or quantitative.67

If the guideline is qualitative, it must follow the theoretically expected
behavior of the component in question. For occurrence, the expected behavior follows a
normal distribution because frequencies tend to behave that way over time. For severity, the
expected behavior is lognormal: failures that do occur usually cause
annoyance and are seldom critical or catastrophic, so the guideline
should follow a right-skewed distribution. For detection, the expected behavior is that of
a discrete distribution, because there is more concern if the
failure is found by the end user than if it is found during the manufacturing phase in the
production facilities. So the guideline should follow a distribution with a gap between
values.

If  the  guideline  is  quantitative,  it  must  be  specific.  It  is  not  necessary  for 
the  guideline  to  follow  a  theoretical  distribution. 

The ranking for each criterion is usually based on a 1 to 10 scale, which
provides ease of interpretation, accuracy, and some precision in the quantification of the
ranking. A 1 to 5 scale, if used, offers convenience but does not give
an accurate “quantification because it reflects a uniform distribution.”68


66 The material from this section is taken (in some places verbatim) from: Stamatis, page 33,
“Interpretation of FMEA.”

67  Stamatis,  page  33. 

68  Stamatis,  page  35. 
The failure’s priority is represented through the risk priority number
(RPN), which is the product of occurrence times severity times detection. The value of
RPN is used only to rank order the concerns of the system. If two or more
failures have the same RPN, we first address the failure with the higher severity and then
the one with the higher detection rating. Severity comes first because it has to do with the
effects of the failure. Detection is next because user dependency is more important than the
failure frequencies.
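
The ranking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `FailureMode`, the function `rank`, and all failure-mode labels and ratings are hypothetical and are not data from this thesis. The sketch computes each RPN as the product of the three ratings and orders the failure modes by RPN, breaking ties first by severity and then by detection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str         # hypothetical label, for illustration only
    occurrence: int   # 1 (lowest) to 10 (highest)
    severity: int     # 1 (lowest) to 10 (highest)
    detection: int    # 1 (easily detected) to 10 (hard to detect)

    @property
    def rpn(self) -> int:
        # RPN = occurrence x severity x detection
        return self.occurrence * self.severity * self.detection


def rank(modes):
    # Highest RPN first; equal RPNs are ordered by higher severity,
    # then by higher detection rating.
    return sorted(modes, key=lambda m: (m.rpn, m.severity, m.detection), reverse=True)


modes = [
    FailureMode("servo jam", occurrence=4, severity=6, detection=5),
    FailureMode("datalink dropout", occurrence=5, severity=8, detection=3),
    FailureMode("battery connector", occurrence=2, severity=9, detection=2),
]
ranked = rank(modes)
for m in ranked:
    print(m.name, m.rpn)
```

In this made-up example the first two failure modes share the same RPN, so the one with the higher severity rating is listed first.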

The  objective  for  product/design  FMEAs  is  to  reveal  product  problems 
that  will  result  in  safety  hazards,  malfunctions  or  shortened  product  life.  FMEAs  can  be 
conducted  at  each  phase  in  the  design  process  (initial  design,  prototype,  final  design)  or  at 
the  production  process  while  it  is  occurring.  “How  can  the  product  fail?”  is  the  basic 
question asked in design FMEAs.69

f.  The  Eight  Steps  Method  for  Implementing  FMEA 

The  eight  steps  of  the  method  are: 

(1)  Select  the  team:  The  team  should  be  “cross-functional  and 
multidisciplinary  and  the  team  members  must  be  willing  to  contribute.”  After  the  team 
has  been  identified,  it  prioritizes  the  opportunities  for  improvement. 

(2)  Do  the  functional  block  diagram:  The  first  step  for  every 
attempt  to  solve  any  problem  is  to  become  familiar  with  the  subject  to  ensure  that 
everyone  on  the  FMEA  team  has  the  same  understanding  of  the  process  and  the 
production  phases.  A  blueprint,  an  engineering  drawing,  or  a  flowchart  review  is 
necessary.  If  it  is  not  available,  the  team  needs  to  create  one.  Team  members  should  see 
the  product  or  a  prototype  and  walk  through  the  production  process  exactly.  A  block 
diagram  of  the  system  provides  an  overview  and  a  working  model  of  the  relationships 
and  interactions  of  the  system’s  subsystems  and  components. 


69  McDermott,  page  25.  The  above  part  of  section  is  a  summary  and  paraphrase  (in  some  places 
verbatim)  of  “Product/Design.” 

70 The material from this section is taken (in some places verbatim) from Stamatis, pages 42-44, “The
Process  of  Conducting  an  FMEA,”  and  McDermott,  pages  28-42,  “The  FMEA  Worksheet.” 
(3) Collect data: The team begins to collect and categorize data.
Then they should start filling in the FMEA forms. The failures identified are the failure
modes of the FMEA.

(4) Brainstorm and prioritize potential failure modes: Important
issues of the problem are recognized by the team. The team can now begin thinking about
potential failure modes that could affect the product function, quality or manufacturing.
Brainstorming sessions place all ideas out on the table. The objective is to create dozens of
ideas. The ideas should be organized by grouping them into similar categories. Grouping
can be done by the type of failure (e.g., mechanical, electrical, communication, etc.) or by the
seriousness of the failure. At this step, the FMEA team reviews the failure modes and
identifies the potential effects of any failure. This step is like an “if-then statement”
process. If that failure occurs, then what are the consequences?

(5)  Analysis:  Assign  a  severity,  occurrence  and  detection  rating  for 
each  effect  and  failure  mode.  The  sequence  from  data  to  information  to  knowledge  to 
decision is followed. The analysis could be qualitative or quantitative and anything may
be used (cause and effect analysis, mathematical modeling, simulation, reliability
analysis, etc.). At this step, severity, occurrence, and detection ratings must be estimated.
Those ratings are based on a 10-point scale, with 1 being the lowest and 10 the
highest in importance. Establishing clear and concise descriptions for the points on each
of  the  scales  is  important  so  that  all  team  members  have  the  same  understanding  of  the 
ratings. 

(a)  The  severity  rating  estimates  how  serious  the  effect 
would  be  if  a  given  failure  did  occur.  Each  effect  should  be  given  its  own  severity  rating, 
even  if  there  are  several  effects  for  a  single  failure  mode. 

(b)  The  most  accurate  way  to  determine  the  occurrence 
rating  is  by  using  actual  failure  data  from  the  product.  When  this  is  not  possible,  failure 
mode  occurrence  must  be  estimated.  Knowing  the  potential  cause  of  failure  can  produce  a 
better  estimate.  Once  the  potential  causes  have  been  identified  for  all  of  the  failure 
modes,  an  occurrence  rating  can  be  assigned,  even  without  failure  data. 
(c) By assigning the detection rating, we estimate how
likely  we  are  to  detect  a  failure  or  the  effect  of  a  failure.  We  start  by  identifying  controls 
that  may  detect  a  failure  or  the  effect  of  a  failure.  In  case  there  are  no  controls,  the 
likelihood  of  detection  will  be  low  and  the  item  would  receive  a  high  rating  (9-10). 

(6) Results: Results are derived from the analysis. RPNs must be
calculated and all FMEA forms completed. The RPN of each item is the product of its
severity, occurrence, and detection ratings. The total RPN is the sum of all RPNs.
This number is used as a metric to compare the revised total RPN against the original
total RPN, once the recommended actions have been introduced. We can now prioritize the
failure modes from the highest RPN to the lowest. A Pareto chart or other diagram helps
to visualize the differences between the various ratings and supports the decision about
which items to work on. Usually it is useful to set a threshold RPN such that everything
above that point is addressed.
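
As a rough sketch of the bookkeeping in this step, the Python fragment below sums the RPNs, orders them for a Pareto-style view, applies a threshold, and compares the total before and after corrective action. The item names, RPN values, and the threshold of 100 are all assumptions made for illustration; the source leaves the choice of threshold to the organization.

```python
# Hypothetical RPN per failure mode (illustrative values only).
rpns = {"engine": 240, "datalink": 120, "servo": 90, "airframe": 30, "payload": 20}

THRESHOLD = 100  # assumed cut-off, not a value taken from the source

total_rpn = sum(rpns.values())

# Pareto ordering: largest RPN first, with the cumulative share of the total.
ordered = sorted(rpns.items(), key=lambda kv: kv[1], reverse=True)
pareto = []
cumulative = 0
for name, rpn in ordered:
    cumulative += rpn
    pareto.append((name, rpn, cumulative / total_rpn))

# Everything above the threshold is selected for corrective action.
to_address = [name for name, rpn in ordered if rpn > THRESHOLD]

# After corrective actions, new ratings give revised RPNs (again hypothetical).
revised = dict(rpns, engine=80, datalink=60)
reduction = total_rpn - sum(revised.values())

print("total RPN:", total_rpn)
print("address first:", to_address)
print("reduction in total RPN:", reduction)
```

The cumulative share in `pareto` is what a Pareto chart would plot as its running line; the tuple layout is just one convenient choice.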

(7) Confirm, evaluate and measure: After the results have been
recorded, the success or failure of the actions is confirmed, evaluated, and measured.
Using an organized process, we can identify and implement actions to eliminate or reduce
the problem of high-risk failure modes. It is very common to achieve a reduction in a
high-risk failure mode. After doing that, we refer back to the severity, occurrence, and
detection  ratings.  Often  the  easiest  approach  to  make  a  process  or  product  improvement  is 
to  increase  detectability  of  the  failure,  thus  lowering  the  detection  rating.  This  is  not  the 
best  approach  because  increasing  failure-detectability  only  makes  it  easier  to  detect 
failures  once  they  occur.  Reducing  severity  is  important,  especially  in  situations  leading 
to  injuries.  The  best  way  for  improvement  is  by  reducing  the  likelihood  of  the  occurrence 
of  the  failure.  And  if  it  is  highly  unlikely  that  a  failure  will  occur,  there  is  less  need  for 
detection  measures.  Evaluation  answers  the  question:  “Is  the  situation  better,  worse  or  the 
same  as  before?” 

(8)  Do  it  all  over  again:  The  team  must  pursue  improvement  until 
the  failures  are  completely  eliminated,  regardless  of  the  answer  from  Step  7,  because 
FMEA  is  a  process  of  continual  improvement.  The  long-term  goal  is  to  eliminate  or 
mitigate  every  failure  completely.  The  short-term  goal  is  to  minimize  the  effects  of  the 
most serious failures, if not eliminate them. Once action has been taken to improve the
product,  new  ratings  should  be  determined  and  a  resulting  RPN  calculated.  For  the  failure 
modes  that  have  been  corrected,  there  should  be  a  reduction  in  the  RPN.  Resulting  RPNs 
and  total  RPNs  can  be  organized  in  diagrams  and  compared  with  the  original  RPNs. 
There  is  no  target  RPN  for  FMEAs.  It  is  up  to  the  organization  to  decide  on  how  far  the 
team  should  pursue  improvements.  Failures  happen  sooner  or  later.  The  question  is  how 
much  relative  risk  the  team  is  willing  to  take.  The  answer,  again,  depends  on 
management  and  the  seriousness  of  failure. 

g. FMEA Team

“A  team  is  a  group  of  individuals  who  are  committed  to  achieving 
common organizational objectives.” They meet regularly to identify and solve problems
and to improve processes. They work and interact openly and effectively together and
produce the desired results for the organization. “Synergy,” which means that “the sum of
the total is greater than the sum of the individuals,” is the characteristic of a team.71

“One  person  typically  is  responsible  for  coordinating  the  FMEA  process 
but  all  FMEAs  are  team-based.”  Team  members  “bring  a  variety  of  perspectives  and 
experiences  to  the  project.”  They  are  “formed  when  needed  and  disbanded”  after  the 
FMEA is completed.72 The first priority for the team is to define the scope of the FMEA. A
clear  definition  of  the  product  or  process  to  be  studied  should  be  written  and  understood 
by  all  team  members. 

h.  Limitations  Applying  FMEA 

(1) “FMEA analysis may be very effective when applied to a
system in which system failures most probably are the results of single-component
failures.” In that way, “each failure is considered individually as an independent
occurrence.” So, an FMEA is not the best approach for analyzing systems with a fair
degree of redundancy (dependency). For such systems, a Fault Tree Analysis (FTA) is a
better alternative.

71 Stamatis, pages 85-88. The above part of section is a summary and paraphrase (in some places
verbatim) of “What Is a Team?” and “Why Use a Team?”

72 McDermott, page 15. The above part of section is a summary and paraphrase (in some places
verbatim) of “The FMEA Team.”

73 Hoyland, page 80. The above part of section is a summary of “Applications.”
(2) FMEA gives inadequate attention to human errors because the
focus  is  on  hardware  failures. 

(3)  The  amount  of  insignificant  work  that  must  be  done  is  also  a 
disadvantage.  Component  failures,  including  those  with  insignificant  consequences,  are 
examined  and  documented.  For  large  complex  systems  with  a  high  degree  of  redundancy, 
the  amount  of  trivial  and  unnecessary  work  is  huge. 

i. FMEA Types

Generally there are four types of FMEA: system, design, process and
service. In the SUAV case, we deal with the system and design FMEA. Failure modes are
caused by system deficiencies in the functions of the system. Deficiencies include
interactions among subsystems and elements of the system.

j. System and Design FMEA74

We focus on system/design FMEA once we begin to analyze the reliability
of SUAVs. A system FMEA is usually accomplished in steps, which “include
conceptual design, detailed design and development, and testing and evaluation.”
Establishing a system FMEA uses a system engineering process, a product
development methodology, research and development, or a combination of these.
During the early stages of development, the main focus is to:

•  Turn  an  operational  need  into  a  demand  for  system  performance 
parameters  and  system  configuration  through  “the  use  of  an  interactive 
process.” 

•  “Integrate  related  technical  parameters  and  assure  compatibility  of 
physical,  functional,  and  program  interfaces”  optimizing  the  total  system. 

•  “Integrate  reliability,  maintainability,  engineering  support,  human  factors, 
safety,  liability,  security,  and  other  related  specialties  into  the  total 
engineering  effort.” 


74  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  Stamatis,  pages  101-129, 
“System  FMEA,”  “Design  FMEA.” 
The first step in conducting the system FMEA is a feasibility study to find
solutions  to  a  problem.  The  outcome  of  the  system  FMEA  is  an  initial  design  with  a 
baseline  configuration  and  operational  specifications. 

Design FMEA is a method of “identifying potential or known failure
modes and providing corrective actions” before the production line starts. Initial sample
runs, prototype runs, and trial runs are excluded. The milestone for the first production
run is important because after that point any modification or change in the design
would be a major problem due to the amount of effort, time and cost required to make
changes at that stage. The design FMEA is a “dynamic process” involving the
implementation of numerous “technologies and methods to produce an effective design.”
This result will be an input for the process and/or service FMEA.

The first step in conducting the design FMEA should be a “feasibility
study and/or a risk-benefit analysis.” The objective of this early stage is to optimize the
system, which means to maximize the system quality, reliability and maintainability, and
minimize cost. The outcome of the design FMEA is a preliminary design, which can be
used as a baseline configuration and functional specification.

k.  Analysis  of  Design  FMEA 

There are two main methods of design: design-to-cost and design-to-customer
requirements. In the first approach, the main goal of the design is to keep costs
within a certain budget. This is also called value-engineering analysis and it is suitable
for commercial products with minimum safety standards. In the design-to-customer
requirements approach, the designer’s primary concern is to satisfy the customer’s
requirements and safety and regulatory obligations. This is common for products related
to military applications and those with high safety standards.

A design FMEA starts with two requirements:

•  Identifying the appropriate form, and

•  Identifying the rating guidelines.


75  Stamatis,  page  129-130. 
The form and the rating guidelines for the design FMEA (or any kind of
FMEA) are not standardized. Each team performing an FMEA makes its own forms and
rating guidelines, which correspond to the project’s special requirements and
characteristics, as well as the designer’s vision and experience.

There are also two ways that the rating guidelines can be formulated: the
qualitative method and the quantitative method. In both cases, the numerical values can
be from 1 to 5 or from 1 to 10, the latter being the most common.

An example of a design FMEA form is in Table 1. The form is divided into
three parts. The first part, with item numbers from 1 to 10, is the introduction. The
second part of the form includes items 11 to 24, which are the body items of any design
FMEA. The third part, items 25 and 26, concerns the authority and responsibility of the FMEA
team. Definition of terms is in Appendix A.
(1) Subsystem Name                        (4) Supplier Involvement       (8) FMEA Date
(2) Design Responsibility                 (5) Model/Product              (9) FMEA Revision Date
(2A) The Head of the System Design Team   (6) Engineering Release Date   (10) Part Name
(3) Involvement of Others                 (7) Prepared by                Page _ of _ Pages

Column headings of the form body:

(11) Design Function
(12) Potential Failure Mode
(13) Potential Effect(s) of Failure
(14) Critical Characteristics
(15) SEV
(16) Potential Cause(s) of Failure
(17) OCC
(18) Detection Method
(19) DET
(20) RPN
(21) Recommended Action
(22) Responsible Area or Person and Completion Date

Action Results:

(23) Action Taken
(24) SEV, OCC, DET, RPN

(25) Approval Signatures                  (26) Concurring Signatures


Table 1. An Example of Design FMEA (From Stamatis, page 131)
l. FMEA Conclusion

Technology  can  develop  complex  systems  today.  UAVs  are  an  example  of 
the  increased  automation  built  into  a  complex  system.  To  be  able  to  develop  these 
systems  efficiently,  a  number  of  appropriate  system  development  processes  can  be  used. 
Implementing  such  a  process  from  the  early  stages  of  design  is  important  for  total 
development,  cost,  and  time. 

The objective of an FMEA is to look for all the ways a system or product
can fail. Failure occurs when a product or system does not function as it should, or when
the user makes a mistake. Failure modes are ways in which a product or process can fail.
Each failure mode has a potential effect. Some effects are more likely to occur than
others. Each effect has a risk associated with it. The FMEA process is a way to identify
failure modes, effects, and risks within a process or product, and to eliminate or reduce them.

The  most  important  reason  for  conducting  an  FMEA  is  the  need  to 
improve.  FMEAs  have  a  positive  impact  because  of  their  preventive  role.  The  purpose  of 
FMEA is preventing system and product problems before they occur. Used in the design
and  manufacturing  process,  they  reduce  cost  and  efforts  by  identifying  product  and 
system  improvements  early  in  the  development  phase  when  it  is  easier,  faster  and  cheaper 
to  make  changes. 

m. Other Tools76

(1) Fault Tree Analysis (FTA). This is a deductive
“analytical technique for reliability and safety analysis used for complex dynamic
systems.” It provides an “objective basis” for further analysis and changes. It was
developed in 1961 by Bell Telephone Laboratories and is widely used in many applications
in industry. FTA is a logical tree in which the “various combinations of possible events”
are represented graphically. It shows the “cause and effect relationships” between a
single failure and its causes. At the top of the tree is the failure, and the various
contributing causes are at the bottom branches of the tree. “The FTA always supplements
the FMEA.”


76  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from  Stamatis,  pages  51-67, 
“Relationships  of  FMEA  and  Other  Tools.” 
This thesis develops an FTA for SUAVs. The FTA process is outlined as
follows:

(a) Identify the system fault state(s) or undesired events.
The top event must be quantifiable, definable, noticeable, controllable and inclusive of
the lower events.

(b)  Proceed  with  Fault  tree  construction.  Determine  the 
level  to  which  the  examination  should  be  conducted  and  fully  describe  all  events  that 
immediately  caused  this  event.  With  each  lower  level  fault,  describe  its  immediate  causes 
until  a  component  level  failure  or  human  error  is  exposed. 

(c) Fault tree analysis is the last step, in which we must
determine the minimal cut sets for tree simplification and the probability of each input
event. For the AND logic gates, the probability of the output is the product of the input
probabilities (assuming independent inputs), while for the OR logic gates it is the sum
if and only if the events are mutually exclusive. Finally we must determine the top event
probability.
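
The gate rules in step (c) can be illustrated with a short Python sketch. The tree, the event names, and the probabilities below are hypothetical and are not taken from the SUAV fault tree developed in this thesis. The sketch also includes the standard formula for an OR gate with independent inputs that are not mutually exclusive, which the text does not state but which is needed when the simple sum does not apply.

```python
from math import prod


def and_gate(probs):
    # Output occurs only if every input occurs (inputs assumed independent).
    return prod(probs)


def or_gate_exclusive(probs):
    # Simple sum; valid only when the input events are mutually exclusive.
    return sum(probs)


def or_gate_independent(probs):
    # Independent, non-exclusive inputs: 1 - P(no input event occurs).
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical tree: loss of propulsion occurs if the engine fails
# OR both redundant fuel pumps fail.
p_engine = 0.02
p_pump_a = 0.05
p_pump_b = 0.05

p_fuel = and_gate([p_pump_a, p_pump_b])
p_top = or_gate_independent([p_engine, p_fuel])

print("P(fuel supply lost) =", p_fuel)
print("P(top event)        =", p_top)
```

For this made-up tree the minimal cut sets are {engine} and {pump A, pump B}: each is a smallest set of basic events whose joint occurrence produces the top event.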

(2)  Functional  flow  diagrams  or  block  diagrams  “illustrate  the 
physical  or  functional  relationships”  within  a  system  under  analysis.  They  are  used  to 
give  a  quick  and  comprehensive  view  of  the  system  design  requirements  illustrating 
series  and  parallel  relationships,  hierarchy  and  other  relationships  among  the  system’s 
functions.  The  types  of  block  diagrams  used  in  FMEA  are: 

(a)  System  Diagrams,  used  for  identifying  relationships 
between  major  components  and  other  system  components  in  large  systems  composed  of 
several  assemblies  or  subsystems, 

(b)  Detail  Diagrams,  used  for  identifying  relationships 
between  each  part  within  an  assembly  or  subsystem,  and 

(c)  Reliability  Diagrams,  used  for  identifying  the  series 
dependence  or  independence  of  major  components,  subsystems  or  detail  parts  in 
achieving  required  functions. 
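
For a reliability diagram, the series and parallel relationships translate into the standard textbook formulas sketched below. The block layout and the reliability values are assumptions chosen only for illustration, and the blocks are assumed to fail independently.

```python
from math import prod


def series(reliabilities):
    # A series arrangement works only if every block works.
    return prod(reliabilities)


def parallel(reliabilities):
    # A parallel (redundant) arrangement works if at least one block works.
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical system: airframe and engine in series with a redundant
# pair of datalinks.
r_airframe = 0.99
r_engine = 0.95
r_datalinks = parallel([0.90, 0.90])

r_system = series([r_airframe, r_engine, r_datalinks])
print("system reliability =", round(r_system, 6))
```

The sketch shows why a block diagram is useful before an FMEA: redundancy raises the reliability of the datalink function above that of either link alone, while every series block lowers the system total.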

(3) FMA. “Failure mode analysis (FMA) is a systematic approach
to quantify failure modes, failure rate and root causes of known failures.” FMA is based
only on historical field and process data. It is a diagnostic tool because it concerns itself
only with failures that are known and/or have occurred. “Both FMA and FMEA deal with failure
modes and causes.” FMA may be conducted first and then the outcome becomes input for
the FMEA.

(4) FMECA, FAMECA. An FMEA becomes a Failure Mode, Effects and
Criticality Analysis (FMECA or FAMECA) if criticalities are assigned to
the failure mode effects.77 Such an analysis identifies any faulty components in the
system so that their reliability, or safety of operation, can be improved early enough for the
designer to make corrections and set limitations in the design. FMECA results may also
be useful when modifying the system and for maintenance planning. In a complex system,
not all components can be redesigned. The most critical components are scientifically
selected, and only these should be improved. FMECA is usually conducted during the
design phase of a system.

(5) FMCA. “Failure mode and critical analysis (FMCA) is a
systematic approach to quantify failure modes, rates and root causes from a criticality
perspective.”78 It is similar to the FMEA in all other details. An FMCA is used
“where the identification of critical, major and minor characteristics is important.” By
focusing on criticality one can identify the single-point failure modes, which are human
errors or hardware failures that can result in an accident.

(6) QFD. Quality function deployment (QFD) is a systematic
methodology that unites the various working groups within a corporation and guides
them to focus on the customer’s choices, demands and expectations. QFD “encourages a
comprehensive, holistic approach to product development.” It is a tool that translates the
customer’s requirements into specific characteristics, manufacturing operations and
production requirements. QFD and FMEA have much in common. They both target
continual improvement by eliminating failures and looking for customer satisfaction.
Usually, QFD occurs first and, based on its results, FMEA follows.


77  Hoyland,  page  74. 

78  Stamatis,  page  62. 
(7) RCM.79 Reliability-centered maintenance (RCM) has its roots
in the aviation industry.80 Airlines and airplane manufacturers developed the RCM
process in the late 1960s. The initial development work was started by the North American
civil aviation industry. The airlines at that time began to realize that existing maintenance
philosophies were not only too expensive but very dangerous as well. In 1980, an
international civil aviation group developed an inclusive basis for different maintenance
strategies. This basis is known as the Maintenance Steering Group-3 (MSG-3) for the
aviation industry.81

The earliest view of failure, in the 1930s, was that as products aged,
due to wear and tear, they were more likely to fail. So the best way to optimize system
reliability and availability was by providing maintenance on a routine basis. During
World War II, awareness about infant mortality led to the widespread belief in the
“bathtub curve.” In that case, overhauls or component replacements should be done at
fixed time intervals to optimize system reliability and availability. This is based on the
assumption that most systems operate reliably for a period of “X” and then wear out.
Keeping records on failures enables us to determine “X” and take preventive actions just
before deterioration starts. This model is true for certain types of simple systems and
some complex ones with age-related failure modes. However, after 1960, due to the
complexity of the systems, research revealed that six failure patterns actually occur in
practice. Data collection and analysis will enable NAVAIR to determine which apply to
SUAVs.

(a) The bathtub curve. It begins with a high
occurrence/incidence of failure, which is the infant mortality, followed by a constant or
gradually increasing conditional probability of failure, and ends up in a wear-out zone
due to age.


79  The  material  from  this  subsection  is  taken  (in  some  places  verbatim)  from:  Aladon  Ltd,  Specialists 
in  the  application  of  Reliability-Centered  Maintenance,  “Reliability  Centred  Maintenance -An 
Introduction,”  Internet,  February  2004.  Available  at:  www.aladon.co.uk/10intro.html 

80 Hoyland, page 79.

81  Aladon  Ltd,  Specialists  in  the  application  of  Reliability-Centered  Maintenance,  “About  RCM,” 
Internet,  February  2004.  Available  at:  www.aladon.co.uk/02rcm.html 
(b) Constant or slowly increasing conditional probability of
failure,  ending  in  a  wear-out  zone. 

(c)  Slowly  increasing  conditional  probability  of  failure,  but 
no  recognizable  wear-out  zone. 

(d)  A  low  conditional  probability  of  failure  when  the  system 
is  new  and  then  a  rapid  increase  to  a  constant  level. 

(e)  A  constant  conditional  probability  of  failure  at  all  ages. 

(f)  A  high  infant  mortality  during  the  early  period  and  then 
constant  or  slowly  decreasing  conditional  probability  of  failure. 

The  above  six  failure  patterns  are  illustrated  in  the  next  figure. 


Figure  1.  The  Six  Failure  Patterns 
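
As a rough illustration of how the conditional probability of failure (the hazard rate) can fall, stay constant, or rise with age, the following minimal Python sketch uses the Weibull hazard function h(t) = (beta/eta)(t/eta)^(beta-1). The Weibull model and every parameter value here are assumptions chosen only to show the shapes; they are not taken from the source or from SUAV data, and not all six patterns are pure Weibull curves. A shape parameter below 1 gives a falling hazard (infant mortality), a value of 1 gives a constant hazard (pattern e), and a value above 1 gives a rising hazard (wear-out). Summing the three approximates the bathtub curve of pattern (a).

```python
def weibull_hazard(t, beta, eta):
    """Weibull hazard rate h(t) = (beta/eta) * (t/eta)**(beta - 1), for t > 0."""
    if t <= 0 or beta <= 0 or eta <= 0:
        raise ValueError("t, beta and eta must be positive")
    return (beta / eta) * (t / eta) ** (beta - 1)


def bathtub_hazard(t):
    """Illustrative bathtub curve: sum of falling, constant and rising hazards.

    All parameter values are hypothetical, chosen only to show the shape.
    """
    infant = weibull_hazard(t, beta=0.5, eta=200.0)     # decreasing with age
    constant = weibull_hazard(t, beta=1.0, eta=1000.0)  # independent of age
    wearout = weibull_hazard(t, beta=4.0, eta=800.0)    # increasing with age
    return infant + constant + wearout


if __name__ == "__main__":
    # Print the hazard at a few ages (hours) to see the bathtub shape.
    for hours in (1, 10, 100, 400, 800, 1000):
        print(f"t={hours:5d} h  h(t)={bathtub_hazard(hours):.5f}")
```

With field data, the shape parameter would be estimated from recorded failure times rather than assumed, which is one reason the data collection discussed in this thesis matters.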
The idea of RCM is based on the realization that what users want depends
on  the  operating  context  of  the  system.  So  RCM  is  “a  process  used  to  determine  what 
must  be  done  to  ensure  that  any  physical  asset  continues  to  do  what  its  users  want  it  to  do 
in  its  present  operating  context.”  The  RCM  process  asks  seven  questions  about  the 
system  under  review.  Any  RCM  process  should  ensure  that  all  of  the  following  seven 
questions  are  answered  satisfactorily  in  the  sequence  shown  below: 

•  What  are  the  functions  and  associated  desired  performance  standards  of 
the  system  in  its  present  operating  context?  (Functions). 

•  In  what  ways  can  it  fail  to  fulfill  its  functions?  (Functional  failures). 

•  What  causes  each  functional  failure?  (Failure  modes) 

•  What  happens  when  each  failure  occurs?  (Failure  effects). 

•  In  what  way  does  each  failure  matter?  (Failure  ramifications) 

•  What  should  be  done  to  predict  or  prevent  each  failure?  (Proactive  tasks 
and  task  intervals). 

•  What  if  a  convenient  solution  cannot  be  found?  (Default  actions) 

Definitions  of  terms  related  to  functions,  functional  failures,  failure 
modes,  and  failure  effects  are  presented  in  appendix  C. 

(8) TAAF. The Test, Analyze, and Fix (TAAF) philosophy is
accomplished in an iterative manner by conducting tests, collecting data, analyzing data,
making the appropriate modifications and starting the tests again. The process starts by
conducting tests on the prototypes. The failure data are collected and the causes are
sought. Corrective actions are then taken to reduce the occurrence of future failures. The
same process is repeated until the test results are acceptable.

Some characteristics of the TAAF process are

•  All  failures  are  fully  analyzed. 

•  Actions  are  taken  in  the  design  and/or  production  phase  to  ensure  that 
failures  do  not  recur. 
• Tests are done at a high level since improvements at that level have the
maximum effect on system reliability.

• Corrective actions must be taken as soon as possible on all components in
the development program.

In general, TAAF is a time-consuming and costly reliability growth
process, which resembles the spiral method of project development.82
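
The iterative TAAF cycle can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not a procedure from the source: the function name `taaf`, the failure-mode names, the rates, the target, and the `fix_effectiveness` factor are all hypothetical, and the sketch corrects only the dominant failure mode in each cycle, whereas a real TAAF program analyzes every failure.

```python
def taaf(mode_rates, target_rate, fix_effectiveness=0.7, max_cycles=50):
    """Iterate test -> analyze -> fix until the system failure rate is acceptable.

    mode_rates: dict mapping failure mode name -> failure rate (per hour).
    fix_effectiveness: assumed fraction of a mode's rate removed by one fix.
    Returns (final mode rates, system failure rate observed in each test).
    """
    rates = dict(mode_rates)
    history = []
    for _ in range(max_cycles):
        system_rate = sum(rates.values())          # test: observe failures
        history.append(system_rate)
        if system_rate <= target_rate:             # results acceptable: stop
            break
        worst = max(rates, key=rates.get)          # analyze: dominant cause
        rates[worst] *= (1.0 - fix_effectiveness)  # fix: corrective action
    return rates, history


if __name__ == "__main__":
    # Hypothetical prototype failure modes and rates.
    modes = {"engine": 0.010, "datalink": 0.004, "servo": 0.002}
    final, history = taaf(modes, target_rate=0.005)
    print(len(history), "test cycles;", "final rates:", final)
```

Each pass through the loop costs a full test campaign, which reflects why the text describes TAAF as time consuming and costly.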

(9) FRACAS. A Failure Reporting, Analysis, and Corrective Action
System (FRACAS)83 or Data Reporting, Analysis, and Corrective Action System
(DRACAS) is commonly referred to as a “closed loop reporting system.” Implemented for a
program during production, integration, testing, and field deployment phases, it allows
for the collection and analysis of reliability and maintainability data for the hardware and
software items. For a successful reliability improvement program, all failures should be
considered. Every hardware and software failure, including the simplest ones, such
as those caused by loose nuts and bolts or loose cables, should be investigated. Corrective
action for each one should be developed. The manufacturer can use FRACAS results to
“incorporate the corrective actions into the product.”84 We develop a FRACAS for
SUAVs in this thesis.
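
The “closed loop” idea can be illustrated with a small record type in which a failure report is closed only after its corrective action has been verified. This is a hedged sketch: the class, field, and state names are hypothetical and do not reproduce the FRACAS forms developed in this thesis.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    REPORTED = 1
    ANALYZED = 2
    CORRECTED = 3
    CLOSED = 4


@dataclass
class FailureReport:
    item: str
    description: str
    status: Status = Status.REPORTED
    root_cause: str = ""
    corrective_action: str = ""

    def analyze(self, root_cause):
        """Record the cause found by failure analysis."""
        self.root_cause = root_cause
        self.status = Status.ANALYZED

    def correct(self, action):
        """Record the corrective action; requires a completed analysis."""
        if self.status is not Status.ANALYZED:
            raise ValueError("analyze the failure before correcting it")
        self.corrective_action = action
        self.status = Status.CORRECTED

    def close(self, verified):
        """Close the loop only if the fix is verified; otherwise re-analyze."""
        if self.status is not Status.CORRECTED:
            raise ValueError("no corrective action to verify")
        self.status = Status.CLOSED if verified else Status.REPORTED


def open_reports(reports):
    """Return every report whose loop has not yet been closed."""
    return [r for r in reports if r.status is not Status.CLOSED]
```

Even a trivial failure such as a loose cable would get its own report under this scheme, consistent with the requirement that all failures be considered.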

2.  Manned  Aviation  Specific:  RCM,  MSG-3 
a.  Introduction  to  RCM 

Reliability  centered  maintenance  (RCM)  originated  in  the  aviation 
industry  in  the  late  60s.  In  the  mid  70s,  the  US  Department  of  Defense  wanted  to  know 
more about aviation maintenance. As a result, Stanley Nowlan and Howard Heap of
United Airlines wrote a report titled “Reliability-Centered Maintenance.” It was
published  in  1978,  and  it  is  still  one  of  the  most  important  documents  in  the  history  of 
physical  asset  management.85  RCM  is  “a  process  used  to  determine  what  must  be  done  to 

82  Blischke,  R.  W.,  and  Murthy  D.  N.  Prabhakar,  Reliability  Modeling,  Prediction,  and  Optimization, 
John  Wiley  &  Sons,  2000,  page  547-548. 

83  Pecht,  M.,  Product  Reliability  Maintainability  and  Supportability  Handbook,  CRC  Press,  1995, 
page  322. 

84  Ibid,  page  324. 
ensure that any physical asset continues to do what its users want it to do in its present
operating context.”

b.  The  Seven  Questions 

The RCM process answers seven questions about the system under review.
Any RCM process should ensure that all of the following seven questions are answered
satisfactorily and are answered in the sequence shown below:

(1) What are the functions and associated desired performance
standards of the system in its present operating context? (Functions)

(2) In what ways can it fail to fulfill its functions? (Functional
failures)

(3) What causes each functional failure? (Failure modes)

(4) What happens when each failure occurs? (Failure effects)

(5) In what way does each failure matter? (Failure ramifications)

(6) What should be done to predict or prevent each failure?
(Proactive tasks and task intervals)

(7) What if a preventative approach cannot be found? (Default
actions)

Defining the functions and desired standards of performance of a
system defines the objectives of maintenance. Defining functional failures enables an
exact explanation of the meaning of failure. The first two questions of the RCM process
address the functions and functional failures. The next two questions identify the failure
modes that are most likely to cause each functional failure and the failure effects
associated with each failure mode. This is done by performing
an FMEA for each functional failure.86

c. RCM-2 87

85  Moubray,  John  summarized  by  Sandy  Dunn,  Plant  Maintenance  Resource  Center,  “Maintenance 
Task  Selection-Part  3,”  Revised  September  18,  2002,  Internet,  May  2004.  Available  at:  http://www.plant- 
maintenance.com/articles  /maintenance_tak_selection_part2.shtml 

86  The  material  from  this  part  of  section  is  taken  (in  some  places  verbatim)  from:  Aladon  Ltd, 
“Introduction.” 

87  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  Aladon  Ltd,  “About  RCM.” 
Nowlan and Heap’s report and MSG-3 have been used as a basis for
various  military  RCM  standards  and  for  non-aviation  derivatives.  Of  these,  by  far  the 
most  widely  used  is  RCM-2. 

RCM-2 is a process used to decide what must be done to ensure that any
physical asset, system or process continues to perform exactly as its user wants it to. The
process  defines  what  users  expect  from  their  assets  in  terms  of 

(1)  Primary  performance  parameters  such  as  output,  throughput, 
speed,  range  and  carrying  capacity,  and 

(2)  Risk  (safety  and  environmental  integrity),  quality  (precision, 
accuracy,  consistency  and  stability),  control,  comfort,  containment,  economy,  customer 
service  and  so  on. 

The  second  step  in  the  RCM-2  process  is  to  identify  the  ways  the  system 
can  fail,  followed  by  an  FMEA  to  associate  all  the  events  that  are  likely  to  cause  each 
failure. 

The  last  step  is  to  identify  a  suitable  failure  management  policy  for  dealing 
with  each  failure  mode.  These  policy  options  may  include  predictive  maintenance, 
preventive  maintenance,  failure  finding,  or  changing  the  design  and/or  configuration  of 
the  system. 

The  RCM-2  process  provides  rules  for  choosing  which  of  the  failure 
management  policies  is  technically  appropriate  and  presents  criteria  for  deciding  the 
frequency  of  the  various  routine  tasks. 

d. SAE Standard JA 1011

RCM-2 complies with SAE Standard JA 1011, “Evaluation Criteria for
Reliability-Centered Maintenance (RCM) Process.” It was published in August 1999 by
the Society of Automotive Engineers (SAE). It is a brief document setting out the
minimum criteria that any process must include to be called an RCM process when
applied to any particular asset or system.88


88 The material from this section is taken (in some places verbatim) from: Aladon Ltd, “About RCM.”
The standard says that in order to be called an “RCM” process, a process
must get satisfactory answers to the seven questions above, which must be asked in that
particular order. The rest of the standard identifies the information that must be gathered,
and the decisions that must be made in order to answer each of these questions
satisfactorily.89

e. MSG-3 90

In July 1968, Handbook MSG-1, “Maintenance Evaluation and Program
Development,” was developed by representatives of various airlines and aircraft manufacturers.
Decision  logic  and  airline/manufacturer  procedures  for  scheduled  maintenance 
development  for  the  new  Boeing  747  were  the  main  part  of  the  document. 

In the 1970s the “Airline/Manufacturer Maintenance Program Planning
Document” or MSG-2 was released. It was a universal document that updated the
decision logic for the latest aircraft.

In 1979, after a decade of MSG-2 implementation, “experience and events
indicated” that MSG procedures needed updating. In addition, new generation aircraft
maintenance requirements, new regulations on maintenance programs, and the high price of
fuel and spare parts greatly influenced maintenance program development. The areas
that were “most likely candidates for improvement” were the difficulty of the decision
logic, the clarity of the difference between economic and safety issues, and the
effectiveness of the hidden functional failures solutions.

The MSG-3 document was created through the participation and combined efforts
of the Federal Aviation Administration (FAA), the Civil Aviation Authority of the UK
(CAA/UK), the Association of European Airlines (AEA), US and European aircraft and
engine manufacturers, airlines, and the US Navy.


89  The  material  from  the  above  part  of  section  is  taken  (in  some  places  verbatim)  from:  Athos 
Corporation,  Reliability-Centered  Maintenance  Consulting,  “SAE  RCM  Standard:  JA  1011,  Evaluation 
Criteria  for  RCM  Process,”  Internet,  February  2004.  Available  at:  http://www.athoscorp.com/SAE- 
RCMStandard.html 

90  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  Air  Transport  Association  of 
America,  “ATA  MSG-3,  Operator/Manufacturer  Scheduled  Maintenance  Development,  Revision  2002.1,” 
Nov  30,  2001,  pages  6-8. 
Some of the major improvements presented by MSG-3 as compared to
MSG-2 were

(1)  For  systems  and  powerplant  treatment: 

(a)  MSG-3  provides  a  “more  rational  procedure  for  task 
definition”  and  “linear  progression  through  the  decision  logic.” 

(b)  “MSG-3  logic  took  a  top-down  or  consequence  of 
failure  approach.”  At  the  beginning,  the  functional  failure  was  evaluated  for  the 
consequences  of  failure  and  was  assigned  one  of  two  basic  categories,  safety  or 
economic. 

(c)  Further  classification  established  sub-categories  based 
on  “whether  the  failure  was  evident  to  or  hidden  from  the  operating  crew.” 

(d)  “Task  selection  questions  were  arranged  in  a  sequence” 
so  that  the  “most  easily  accomplished  task,  was  considered  first.”  If  the  task  was  not 
applicable  or  effective,  then  “the  next  task  in  sequence  was  considered,  down  to  and 
including  possible  redesign.” 

(2) For structures treatment, “fatigue, corrosion, accidental damage,
age exploration” and other considerations were incorporated in the logic diagram.

(3)  “MSG-3  recognized  the  new  damage  tolerance  rules  and  the 
supplemental  inspection  programs  and  provided  a  method  by  which  their  purpose  could 
be  adapted  to  the  Maintenance  Review  Board  (MRB)  process  instead  of  relying  on  type 
data certificate restraints.” The MRB is discussed in Appendix B.

(4)  MSG-3  logic  was  “task-oriented  and  not  maintenance  process 
oriented.”  With  the  task-oriented  concept,  “one  would  be  able  to  view  the  MRB 
document  and  identify  the  initial  scheduled  maintenance  for  a  given  item.”  Definitions 
for  the  MRB  are  in  appendix  B. 

(5)  Servicing/lubrication  was  included  as  part  of  the  logic  diagram 
to  emphasize  its  severity. 
(6) Treatment of hidden functional failures was more thorough
because of their distinct separation from the evident functional failures.

(7) “The effect of concurrent or multiple failures was considered.”

(8) “Structures decision logic no longer contained a specific
numerical rating system.”

f. MSG-3 Revision 91

In 1987, after seven years of MSG-3 experience, the first revision was
undertaken and released, and in 1993 revision two followed. In 2001, MSG-3 revision
2001 was incorporated and in 2002, revision 2002 was issued and is now in effect.

MSG-3 is intended to facilitate the development of initial
scheduled  maintenance.  The  remaining  maintenance  (that  is  non- 
scheduled  or  non-routine  maintenance)  consists  of  maintenance  actions  to 
correct  discrepancies  noted  during  scheduled  maintenance  tasks,  other 
non-scheduled  maintenance,  normal  operation  or  data  analysis. 

The  analysis  process  identifies  all  scheduled  tasks  and  intervals  based  on 
the  aircraft’s  certificated  operating  capabilities. 

“The  management  of  the  scheduled  maintenance  development  activities” 
should  be  accomplished  by  an  Industry  Steering  Committee  (ISC),  which  consists  of 
members  from  representatives  of  operators,  and  prime  airframe  and  engine 
manufacturers.  “The  ISC  should  see  that  the  MSG-3  process  identifies  100% 
accountability  for  all  Maintenance  Significant  Items  (MSI’s)  and  Structural  Significant 
Items  (SSI’s).” 

An MSI is an item, identified by the manufacturer, whose failure

•  can  affect  ground  or  flight  safety,  and/or 

•  is  undetectable  during  operation  time,  and/or 

•  could  have  significant  operational  and/or  economic  impact.92 

91 The material from this section is taken (in some places verbatim) from: ATA MSG-3, pages 9-13.

92  ATA  MSG-3,  page  87. 
An SSI is any “element or assembly” related to significant flight, ground,
pressure or control loads. An SSI failure could affect the structural integrity of the
aircraft.93

“One or more working groups, composed of specialist representatives
from the participating operators, the prime manufacturer and the Regulatory Authority,
may be constituted.” The ISC will approve analyses, technical data and information,
which will be “consolidated into a final report for presentation to the Regulatory
Authority.”

g. General Development of Scheduled Maintenance 94

For each new type of aircraft, it is necessary to develop scheduled
maintenance  prior  to  its  introduction  into  airline  service.  The  MSG-3  (revision  2002) 
document  has  the  primary  purpose  “to  develop  a  proposal  to  assist  the  Regulatory 
Authority  in  establishing  initial  scheduled  maintenance  tasks  and  intervals  for  new  types 
of  aircraft  and/or  powerplants.”  The  intention  is  to  maintain  and  to  enhance  the  inherent 
“safety  and  reliability  levels  of  the  aircraft.”  As  operating  experience  is  gained,  the 
operator  may  make  additional  adjustments  to  maintain  and  to  enhance  safety  and 
reliability. 

The  objectives  of  efficient  aircraft  scheduled  maintenance  are 

•  To  ensure  the  inherent  safety  and  reliability  levels  of  the  aircraft; 

•  “To  restore  safety  and  reliability  to  their  inherent  levels  when  deterioration 
has  occurred;” 

•  “To  obtain  the  information  needed  for  design  improvement  of  those  items 
whose  inherent  reliability  proves  insufficient;” 

•  To  achieve  the  above  goals  at  a  minimum  total  cost. 

It follows from the above objectives that scheduled maintenance can only
prevent deterioration of the inherent levels. If the inherent levels are unsatisfactory, then
redesign is necessary to achieve the desired safety and reliability levels.

93  Ibid,  page  89. 

94 The material from this section is taken (in some places verbatim) from: ATA MSG-3, pages 14-16.
Scheduled maintenance consists of two groups of tasks:

(1) “A group of scheduled tasks to be accomplished at specified
intervals. The objectives of these tasks are to prevent deterioration of the inherent safety
and reliability levels of the aircraft.” They may include lubrication/servicing (LU/SV),
operational/visual check (OP/VC), inspection/functional check (IN/FC), restoration (RS)
and discard (DS).

(2) A group of non-scheduled tasks that result from the scheduled
tasks accomplished at specified intervals, and reports of malfunctions usually created by
the operating crew and data analysis. The objectives of these tasks are to bring the aircraft
to a desired condition.

An efficient program schedules only those tasks necessary to meet the
fixed objectives. Additional tasks, which will increase cost without any significant
improvement in reliability, are not scheduled. The MSG-3 document “describes the
method for developing the scheduled maintenance” using a “guided logic approach.” The
logic flow of analysis is “failure-effect oriented” while the result must be a task-oriented
program. Items with no scheduled task specified may be monitored by an operator’s
reliability program. Finally, assumptions that can result in a change must be documented.

h. Divisions of MSG-3 Document 95

The working portions of MSG-3 are contained in four sections. They are a
section for System/Powerplant, including components and Auxiliary Power Units
(APUs); a section for aircraft structure; a section for zonal inspection; and finally a
section for lightning/high intensity radiated field (L/HIRF) analysis. “Each section
contains its own explanatory material and decision logic diagram, and it may be used
independently of other MSG-3 sections.”

In the following sections (i through p), Aircraft Systems/Powerplant
Analysis is further discussed because it has the closest potential relationship
with SUAV applications.

i. MSI Selection 96

95 The material from this section is taken (in some places verbatim) from: ATA MSG-3, page 16.

96 The material from this section is taken (in some places verbatim) from: ATA MSG-3, pages 22-23.
A progressive logic diagram is the evaluation technique applied to each
maintenance significant item (MSI) using the technical data available. An MSI may be a
system, a subsystem, module, component, accessory, unit or part. In general, the
evaluations are based on the item’s functional failures and causes of the failure.

Before MSG-3 logic can be applied to an item, the aircraft’s significant
systems and components must be identified. Then, using the top-down approach, MSIs
must be identified. To select MSIs, the process is as follows:

(1) The manufacturer divides the aircraft into the main functional
areas, Air Transport Association (ATA) systems, and subsystems. This division continues
“until all the aircraft’s replaceable components have been identified.”

(2) “The manufacturer establishes the list of items to which MSI
selection questions will be applied.”

(3) Those questions applied to the items in the lists are

(a) “Could failure be undetectable or not likely to be
detected by the operating crew during normal duties?” (Detectability)

(b) Could failure affect safety on ground or in flight?
(Safety part of severity)

(c) “Could failure have a significant operational impact?”
(Operational part of severity)

(d) “Could failure have a significant economic impact?”
(Economic part of severity)

(4) Subsequent analysis.

(a) If at least one of the above four questions is answered
with “yes,” MSG-3 analysis is required. “An MSI is usually a system or subsystem,” and
in most cases is “one level above the lowest level identified” in (1). “This level is
considered the highest manageable level; i.e. one that is high enough to avoid
unnecessary analysis, but low enough to be properly analyzed.”
(b) For those items for which all four questions are
answered  with  a  “no,”  MSG-3  analysis  is  not  required.  “The  lower  level  items  should  be 
listed  to  identify  those  that  will  not  be  further  assessed.”  This  list  must  be  reviewed  and 
approved  by  the  Industry  Steering  Committee  (ISC). 

(5)  The  resulting  list  for  the  highest  manageable  level  items  is 
considered  the  “candidate  MSI  list”  and  is  presented  by  the  manufacturer  to  the  ISC.  The 
ISC  reviews  and  approves  this  list,  which  is  passed  to  the  working  groups  (WGs). 

(6)  The  WGs  review  the  candidate  MSI  list  in  order  “to  verify  that 
no  significant  items  have  been  overlooked,  and  that  the  right  level  for  the  analysis  has 
been  chosen.”  By  applying  MSG-3  analysis,  the  WGs  can  “validate  the  selected  highest 
manageable  level  or  propose  modification  of  the  MSI  list  to  the  ISC.” 
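
The screening rule in steps (3) and (4) can be sketched in a few lines: an item goes on the candidate MSI list if any of the four selection questions is answered “yes,” and otherwise goes on the list of items that will not be further assessed. The type and function names and the example items are hypothetical, and the ISC and working group reviews in steps (5) and (6) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class SelectionAnswers:
    hidden_from_crew: bool      # (a) detectability
    affects_safety: bool        # (b) safety part of severity
    operational_impact: bool    # (c) operational part of severity
    economic_impact: bool       # (d) economic part of severity


def needs_msg3_analysis(answers):
    """MSG-3 analysis is required if at least one answer is 'yes'."""
    return any((answers.hidden_from_crew, answers.affects_safety,
                answers.operational_impact, answers.economic_impact))


def screen_items(items):
    """Split items into (candidate MSI list, items not further assessed).

    items: dict mapping item name -> SelectionAnswers.
    """
    candidates, excluded = [], []
    for name, answers in items.items():
        (candidates if needs_msg3_analysis(answers) else excluded).append(name)
    return candidates, excluded


if __name__ == "__main__":
    # Hypothetical items with made-up answers, for illustration only.
    items = {
        "fuel pump": SelectionAnswers(False, True, True, True),
        "reading light": SelectionAnswers(False, False, False, False),
    }
    print(screen_items(items))
```

Both output lists matter: the candidates go forward for analysis, and the excluded list is kept so that it can be reviewed and approved.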

j. Analysis Procedure 97

For  each  MSI,  the  following  must  be  identified: 

•  Function(s),  the  “normal  characteristic  actions  of  an  item” 

•  Functional Failure(s), the failure of an item to perform its planned
function(s)

•  Failure  Effect(s),  the  result  of  a  functional  failure 

•  Failure  Cause(s),  the  reason  for  the  functional  failure  occurrence 

Analysis  should  take  special  care  to  “identify  the  functions  of  all 
protective  devices,”  and  include  economic  and  safety  related  tasks  in  order  to  “produce 
initial  scheduled  maintenance  tasks  and  intervals.”  Vendor  recommendations  (VR)  that 
are  available  should  be  “considered  and  discussed  in  the  WGs  meetings  and  accepted  if 
they  are  applicable  and  effective.” 

A  preliminary  work  sheet,  prior  to  applying  the  MSG-3  logic  diagram  to 
an  item,  clearly  defines  the  MSI,  its  function(s),  functional  failure(s),  failure  cause(s)  and 
additional  data  for  each  item. 

k. Logic Diagram 98

97 The material from this section is taken (in some places verbatim) from: ATA MSG-3, pages 23-24.

98 The material from this section is taken (in some places verbatim) from: ATA MSG-3, pages 24-25.
The decision logic diagram, illustrated in Figures 2 and 3, assists in
analyzing systems in general and powerplant items in particular. The logic flow follows a
top-down approach and answers the “yes” or “no” questions giving the direction of the
analysis flow after each answer.

There  are  two  levels  in  the  decision  analysis: 

“(1)  Level  1  requires  the  evaluation  of  each  functional  failure  in 
order  to  determine  the  failure  effect  category;  i.e.  safety,  operational,  economic,  hidden 
safety  or  hidden  non-safety. 

(2)  Level  2  then  takes  the  failure  cause(s)  for  each  functional 
failure  into  account  for  selecting  the  specific  type  of  task(s).” 

In  Level  2,  regardless  of  the  answer  to  the  first  question  about 
lubrication/servicing  (LU/SV),  the  next  task  selection  question  must  always  be  asked. 
When  following  the  hidden  or  evident  safety  effects  path,  all  successive  questions  must 
be  asked.  In  the  remaining  categories  that  follow  the  first  question,  a  “yes”  answer 
permits  exiting  the  logic. 

Default logic concerns paths that do not affect safety. If there is no
adequate information to give a clear “yes” or “no” to the questions in the second level, then
“default logic dictates a ‘no’ answer.” “No,” as an answer in most cases, provides a more
conservative and/or costly task.
[Flowchart not reproduced. For an evident functional failure, the Level 1 questions lead to the Level 2 categories: safety effects, operational effects, and economic effects.]

Figure 2. Systems Powerplant Logic Diagram Part 1 (After ATA MSG-3, page 18)
[Flowchart not reproduced: hidden functional failure paths.]

Figure 3. Systems Powerplant Logic Diagram Part 2 (After ATA MSG-3, page 20)
l. Procedure

This procedure requires consideration of the functional failures,
failure causes, and the applicability or effectiveness of each task. Each
functional failure processed through the logic will be directed into one of
five failure effect categories:99

•  Safety 

•  Operational 

•  Economic 

•  Hidden  safety 

•  Hidden non-safety 100
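
The routing of a functional failure into one of these five categories by the four Level 1 questions of the logic diagram can be sketched as a small decision function. This is an illustrative reading, not a reproduction of the ATA MSG-3 diagram: the function and parameter names are hypothetical, and treating a “no” to the operational-effect question as the economic category is an assumption inferred from the five categories listed.

```python
from enum import Enum


class EffectCategory(Enum):
    SAFETY = "Safety"
    OPERATIONAL = "Operational"
    ECONOMIC = "Economic"
    HIDDEN_SAFETY = "Hidden safety"
    HIDDEN_NON_SAFETY = "Hidden non-safety"


def level1_category(evident, direct_safety_effect=False,
                    hidden_safety_effect=False, operational_effect=False):
    """Route one functional failure through the Level 1 questions.

    evident: is the failure evident to the operating crew during
        normal duties? (Question 1)
    direct_safety_effect: direct adverse effect on operating safety?
        (Question 2, asked only for evident failures)
    hidden_safety_effect: does the hidden failure combined with one
        additional failure affect safety? (Question 3, hidden failures only)
    operational_effect: direct adverse effect on operating capability?
        (Question 4, asked only after a "no" to Question 2)
    """
    if evident:
        if direct_safety_effect:
            return EffectCategory.SAFETY
        if operational_effect:
            return EffectCategory.OPERATIONAL
        return EffectCategory.ECONOMIC      # assumed outcome of "no" to Q4
    if hidden_safety_effect:
        return EffectCategory.HIDDEN_SAFETY
    return EffectCategory.HIDDEN_NON_SAFETY


if __name__ == "__main__":
    print(level1_category(evident=True, operational_effect=True).value)
    print(level1_category(evident=False, hidden_safety_effect=True).value)
```

Only the answers relevant to the path taken are consulted, which mirrors the way each answer in the diagram determines the next question.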

m. Fault Tolerant Systems Analysis 101

“In  MSG-3  analysis,  a  fault  tolerant  system  is  one  that  has  redundant 
elements that can fail without impacting safety or operating capability.” These faults are
not  very  noticeable  to  the  operating  crew  and  the  aircraft’s  safety  and  airworthiness  is  not 
impaired.  So,  “functional  failures,  in  fault  tolerant  systems,  are  hidden  non-safety.”  The 
“fault-tolerant”  faults  can  be  “detected  by  interrogation  of  the  system.” 

The  method  for  analyzing  MSIs  that  include  fault-tolerant  functions  has 
the  following  steps: 

•  “The  manufacturer  identifies  and  lists  all  functions,  highlighting  those  that 
are  fault-tolerant.” 

•  The  basis  for  identifying  fault-tolerant  functions  must  be  provided. 

•  “For non-fault-tolerant functions, the standard analysis process must be
used.”

•  “For fault-tolerant functions, the WGs must determine and select an
applicable and effective task and interval, based on the available data from
the manufacturer.”

99  ATA  MSG-3,  page  25. 

100  Ibid,  page  21. 

101  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  ATA  MSG-3,  page  26. 
n. Consequences of Failure in the First Level 102

There are four first-level questions.

(1)  Evident  or  Hidden  Functional  Failure.  Question:  “Is  the 
occurrence  of  a  functional  failure  evident  to  the  operating  crew  during  the  performance  of 
normal  duties?” 

The  intention  for  this  question  is  to  separate  the  evident  from  the 
hidden  functional  failures.  The  operating  crew  is  the  pilots  and  air  crew  on  duty.  The 
ground  crew  is  not  part  of  the  operating  crew.  A  “yes”  answer  indicates  the  functional 
failure  is  evident  and  leads  to  Question  2.  A  “no”  answer  indicates  the  functional  failure 
is  hidden  and  leads  to  Question  3. 

(2)  Direct  Adverse  Effect  on  Safety.  Question:  “Does  the 
functional  failure  or  secondary  damage  resulting  from  the  functional  failure  have  a  direct 
unfavorable  effect  on  operating  safety?” 

A  direct  functional  failure  or  resulting  secondary  damage 
“achieves  its  effect  by  itself,  not  in  combination  with  other  functional  failures.”  If  the 
consequences  of  the  failure  condition  would  “prevent  the  continued  safe  flight  and 
landing  of  the  aircraft  and/or  might  cause  serious  or  fatal  injury  to  human  occupants,” 
then  safety  should  be  considered  as  unfavorably  affected.  A  “yes”  answer  indicates  that 
this  functional  failure  must  be  considered  within  “the  Safety  Effects  category”  and  task(s) 
must  be  developed  accordingly.  A  “no”  answer  indicates  the  effect  is  either  “operational 
or  economic”  and  leads  to  question  4. 

(3)  Hidden  Functional  Failure  Safety  Effect.  Question:  “Does  the 
combination  of  a  hidden  functional  failure  and  one  additional  failure  of  a  system  related 
or  back-up  function  have  an  adverse  effect  on  operating  safety?” 

This question is asked of each hidden functional failure identified
in Question 1. A “yes” answer indicates that there is a “safety effect and task
development must proceed in accordance” with the hidden-function safety-effects
category. A “no” answer indicates that there is a “non-safety effect and will be handled in
accordance” with the hidden-function non-safety-effects category.


102  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  ATA  MSG-3,  pages  26-30. 
(4) Operational Effect. Question: “Does the functional failure
have a direct unfavorable effect on operating capabilities?”

This question requires consideration of operating restrictions, correction
prior to further dispatch, and abnormal or emergency procedures required of the
flight crew. A “yes” answer means that the functional failure has an unfavorable
effect on operating capability, and task selection is handled in the evident
operational effects category. A “no” answer means that the effect is economic and
is handled in the evident economic effects category.

o.  Failure Effect Categories in the First Level 103
After  the  analysts  have  answered  the  applicable  first-level  questions,  “they 
are  directed  to  one  of  the  five  effect  categories.” 

(1) Evident Safety: The Evident Safety Effect category concerns
tasks that assure safe operation. “All questions in this category must be asked.” If
no effective task results from the analysis in this category, “redesign is mandatory.”

(2) Evident Operational: In this category, a task is “desirable if it
reduces the risk of failure to an acceptable level.” The analysis requires the first
question (LU/SV, lubrication/servicing) to be answered and, regardless of the answer,
proceeds to the next-level question. From that point, a “yes” answer completes the
analysis, and “the resultant task(s) will satisfy the requirements.” If all answers are
“no,” then no task has been generated, and if operational penalties are severe,
redesign may be desirable.

(3) Evident Economic: In this category, a task is desirable if its
cost is less than the repair cost. The analysis follows the same logic as the
operational category. If all answers are “no,” then no task has been generated, and if
economic penalties are severe, redesign may be desirable.

(4)  Hidden  Safety:  “The  hidden  function  safety  effect  requires  a 
task(s)  to  assure  the  availability  necessary  to  avoid  the  safety  effect  of  multiple  failures.” 
All  questions  must  be  asked  and  “if  there  are  no  tasks  found  effective,  then  redesign  is 
mandatory.” 


103  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  ATA  MSG-3,  pages  31-38. 
(5) Hidden Non-Safety: “The hidden function non-safety category
indicates  that  a  task(s)  may  be  desirable  to  assure  the  availability  necessary  to  avoid  the 
economic  effects  of  multiple  failures.”  Analysis  has  the  same  logic  as  the  operational 
category.  If  all  answers  are  “no,”  no  task  has  been  generated  and  if  economic  penalties 
are  severe,  a  redesign  may  be  desirable. 
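
The four first-level questions and the five effect categories described above form a small decision tree. The following is a minimal sketch of that tree in Python. The function name, the argument names, and the enumeration are illustrative assumptions and do not come from ATA MSG-3; only the branching order (Question 1, then Question 2 or 3, then Question 4) follows the text.

```python
from enum import Enum


class EffectCategory(Enum):
    EVIDENT_SAFETY = "evident safety"
    EVIDENT_OPERATIONAL = "evident operational"
    EVIDENT_ECONOMIC = "evident economic"
    HIDDEN_SAFETY = "hidden safety"
    HIDDEN_NON_SAFETY = "hidden non-safety"


def classify_functional_failure(evident_to_crew,
                                direct_safety_effect=False,
                                hidden_safety_effect=False,
                                operational_effect=False):
    """Route one functional failure to an effect category.

    Each argument is the analyst's yes/no answer to one first-level question.
    """
    if evident_to_crew:                       # Question 1: evident to the operating crew?
        if direct_safety_effect:              # Question 2: direct adverse effect on safety?
            return EffectCategory.EVIDENT_SAFETY
        if operational_effect:                # Question 4: effect on operating capability?
            return EffectCategory.EVIDENT_OPERATIONAL
        return EffectCategory.EVIDENT_ECONOMIC
    if hidden_safety_effect:                  # Question 3: hidden failure plus one more failure
        return EffectCategory.HIDDEN_SAFETY
    return EffectCategory.HIDDEN_NON_SAFETY


# Hypothetical example: a failure the crew notices, with no safety effect,
# that restricts operations.
category = classify_functional_failure(True, operational_effect=True)
print(category.value)
```

Answers to questions that are not reached on a given path are ignored, which mirrors the way the logic diagram skips them.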

p.  Task Development in the Second Level 104

Task development is handled in a similar manner for each of the five effect
categories. “It is necessary to apply the failure causes for the functional failure to the second
level of the logic diagram” for the task resolution as in Table 2. There are six possible
task follow-on questions in the effect categories.

(1)  Lubrication/servicing  (in  all  categories).  Question:  “Is  the 
lubrication  or  servicing  task  applicable  and  effective?” 

“Any  act  of  lubrication  or  servicing  for  the  purpose  of  maintaining 
the  inherent  design  capabilities”  is  considered. 

(2)  Operational/visual  check  (hidden  functional  failure  categories 
only).  Question:  “Is  a  check  to  verify  operation  applicable  and  effective?” 

“The  operational  check  is  a  task  to  determine  that  an  item  is 
fulfilling  its  intended  purpose.”  It  is  a  failure-finding  task  and  does  not  require 
quantitative  tolerances.  “A  visual  check  is  an  observation  to  determine  that  an  item  is 
fulfilling  its  intended  purpose.”  It  is  also  a  failure-finding  task  and  does  not  require 
quantitative  tolerances. 

(3)  Inspection/functional  check  (All  categories).  Question:  “Is  an 
inspection  or  functional  check  to  detect  degradation  of  function  applicable  and 
effective?” 

An inspection may be a general visual inspection; a detailed inspection,
which may involve surface cleaning or elaborate access procedures; or a special
detailed inspection, which may involve extensive surface cleaning and substantial
access and disassembly procedures. “A functional check is a quantitative
check to determine if one or more functions of an item performs within specified limits.”


104  The material from this section is taken (in some places verbatim) from: ATA MSG-3, pages 31-47.

(4) Restoration (All categories). Question: Is a restoration task to
reduce  the  failure  rate  applicable  and  effective? 

Restoration  is  the  “work  necessary  to  return  the  item  to  a  specific 
standard.”  The  scope  of  each  assigned  restoration  task  has  to  be  clearly  specified. 

(5)  Discard  (All  categories).  Question:  Is  a  discard  task  to  avoid 
failures  or  reduce  the  failure  rate  applicable  and  effective? 

Discard  is  the  “removal  from  service  of  an  item  at  a  specified  life 
limit.”  It  is  a  typical  task  applied  to  single  celled  parts  such  as  cartridges,  canisters, 
filters,  engine  disks,  etc. 

(6) Combination (Safety categories only). Question: Is there a
task or combination of tasks applicable and effective?

All possible paths must be analyzed since this is a safety category
question.
| Task | Applicability | Safety Effectiveness | Operational Effectiveness | Economic Effectiveness |
|---|---|---|---|---|
| Lubrication or Servicing | The replenishment of the consumable must reduce the rate of functional deterioration. | The task must reduce the risk of failure. | The task must reduce the risk of failure to an acceptable level. | The task must be cost effective (i.e., the cost of the task must be less than the cost of the failure prevented). |
| Operational or Visual Check | Identification of failure must be possible. | The task must ensure adequate availability of the hidden function to reduce the risk of a multiple failure. | Not applicable. | The task must ensure adequate availability of the hidden function, to avoid economic effects of multiple failures, and must be cost effective. |
| Inspection or Functional Check | Reduced resistance to failure must be detectable, and there exists a reasonably consistent interval between a deterioration condition and functional failure. | The task must reduce the risk of failure to assure safe operation. | The task must reduce the risk of failure to an acceptable level. | The task must be cost effective. |
| Restoration | The item must show functional degradation characteristics at an identifiable age, and a large proportion of units must survive to that age. It must be possible to restore the item to a specific standard of failure resistance. | The task must reduce the risk of failure to assure safe operation. | The task must reduce the risk of failure to an acceptable level. | The task must be cost effective. |
| Discard | The item must show functional degradation characteristics at an identifiable age, and a large proportion of units must survive to that age. | The safe life limit must reduce the risk of failure to assure safe operation. | The task must reduce the risk of failure to an acceptable level. | An economic life limit must be cost effective. |

Table 2.  Task Selection Criteria (After ATA MSG-3, page 46)
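
The second-level logic can also be sketched in code. The sketch below assumes a simple dictionary of yes/no answers keyed by task name; the names `TASK_QUESTIONS`, `questions_for`, and `select_tasks` are illustrative and are not part of MSG-3. It encodes which of the six task questions apply to each effect category, and the rule that safety categories must ask every question while the other categories answer the first question, continue regardless, and then stop at the first later “yes.”

```python
SAFETY = {"evident safety", "hidden safety"}
HIDDEN = {"hidden safety", "hidden non-safety"}
ALL = SAFETY | HIDDEN | {"evident operational", "evident economic"}

# Task question -> effect categories in which it is asked (per the text above).
TASK_QUESTIONS = [
    ("lubrication/servicing", ALL),
    ("operational/visual check", HIDDEN),
    ("inspection/functional check", ALL),
    ("restoration", ALL),
    ("discard", ALL),
    ("combination", SAFETY),
]


def questions_for(category):
    """Return the task questions, in order, that apply to an effect category."""
    if category not in ALL:
        raise ValueError(f"unknown effect category: {category}")
    return [task for task, categories in TASK_QUESTIONS if category in categories]


def select_tasks(category, answers):
    """Apply the second-level logic to a dict of {task name: True/False} answers."""
    questions = questions_for(category)
    if category in SAFETY:
        # Safety categories: every question is asked; all "yes" tasks are kept.
        selected = [q for q in questions if answers.get(q, False)]
        return selected, ("tasks selected" if selected else "redesign is mandatory")
    selected = []
    first, rest = questions[0], questions[1:]
    if answers.get(first, False):      # first question is always answered
        selected.append(first)
    for q in rest:                     # afterwards, the first "yes" completes the analysis
        if answers.get(q, False):
            selected.append(q)
            break
    return selected, ("tasks selected" if selected else "redesign may be desirable")


# Hypothetical answers for one hidden non-safety failure cause.
print(select_tasks("hidden non-safety",
                   {"lubrication/servicing": True,
                    "inspection/functional check": True,
                    "discard": True}))
```

In the non-safety branch the discard task is never reached once the inspection answer is “yes,” which is the intended behavior of the stop-at-first-yes rule.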
3.  Comparison of Existing Methods
a.  RCM 

It  is  clear  that  maintenance  activity  must  help  ensure  that  the  inherent 
levels  of  safety  and  reliability  of  the  aircraft  are  maintained. 

The  days  of  doing  maintenance  just  for  the  sake  of  maintenance  or 
because  it  makes  us  “feel  good”  are  past.  Studies  have  revealed  that 
technicians  performing  maintenance  based  on  “trivial  knowledge”  rather 
than  the  air  carrier’s  approved  maintenance  program  have  generated 
errors.  In  other  cases,  technicians  performing  approved  maintenance  that 
was  not  necessary  have  also  generated  maintenance  errors.  Each  time  we 
provide  technicians  access  to  an  aircraft,  we  also  provide  the  potential  for 
that  technician  to  inadvertently  induce  an  error.  105 

In simple terms, the goals of RCM are to:

•  Ensure  realization  of  the  equipment’s  inherent  safety  and  reliability. 

•  Restore  equipment’s  safety  and  reliability  to  required  levels  when 
deterioration  occurs. 

•  Obtain  the  information  necessary  for  design  improvements  where  inherent 
reliability  is  insufficient. 

•  Accomplish these goals at a minimum total life-cycle cost. 106

The RCM logic is simply to:

•  Determine the function of the system/component;

•  Find out what the functional failures are;

•  Evaluate the consequences of each failure; and,

•  Assign the least expensive but adequate maintenance task to prevent each
failure. 107
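
The last step of the RCM logic, assigning the least expensive but adequate task, can be illustrated with a short sketch. The `CandidateTask` record, its fields, and the example tasks and costs are hypothetical; the source does not prescribe how cost or adequacy is measured, so both are treated here as inputs supplied by the analyst.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CandidateTask:
    name: str
    cost: float      # assumed cost per task execution, in any consistent unit
    adequate: bool   # analyst's judgment: does the task adequately prevent the failure?


def cheapest_adequate_task(candidates: Iterable[CandidateTask]) -> Optional[CandidateTask]:
    """Return the lowest-cost task among those judged adequate, or None if there is none."""
    adequate = [task for task in candidates if task.adequate]
    if not adequate:
        return None
    return min(adequate, key=lambda task: task.cost)


# Hypothetical candidates for one failure mode of a SUAV propulsion system.
candidates = [
    CandidateTask("visual check of propeller", cost=5.0, adequate=False),
    CandidateTask("functional check of engine", cost=40.0, adequate=True),
    CandidateTask("engine replacement at fixed life", cost=900.0, adequate=True),
]
choice = cheapest_adequate_task(candidates)
print(choice.name if choice else "no adequate task")
```

The cheapest task overall is rejected here because it is not adequate, which is the point of the rule.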


105  Nakata, Dave, White paper, “Can Safe Aircraft and MSG-3 Coexist in an Airline Maintenance
Program?”, Sinex Aviation Technologies, 2002, Internet, May 2004. Available at: http://www.sinex.com/
products/Infonet/qS.htm

106  The above part is taken (in some places verbatim) from: Nakata.

107  The above part is taken (in some places verbatim) from: National Aeronautics and Space
Administration (NASA), “Reliability Centered Maintenance & Commissioning,” slide 5, February 16,
2000, Internet, May 2004. Available at: http://www.hq.nasa.gov/office/codej/codejx/Infro2.pdf
b.  Conducting RCM Analysis

Some managers, who see RCM as a quick, cheap, and easy route to
obtaining the particular maintenance policies they are seeking, frequently overrule junior
staff taking part in RCM analysis. This is a poor approach to the conduct of any analysis.
RCM is better conducted by a review group, which may involve senior staff alongside
more junior staff. An experienced analyst with a developed background in RCM and in
managing groups should lead it. If the group functions poorly, it is improper to blame
RCM for what project management is failing to achieve. 108

c.  Nuclear Industry & RCM 109

The  initial  maintenance  programs  in  US  nuclear  power  plants  were 
developed  in  conventional  fashion,  mainly  depending  on  vendor  recommendations. 
“Continuing  efforts  to  enhance  safety  and  reliability”  resulted  in  “utility  management  at 
some plants” questioning if the overall outcome was a “significant degree of
over-maintenance.” By the early 1980s, the nuclear power industry seemed to be “faced with a
choice  of  either  generating  power  or  doing  the  prescribed  planned  maintenance  (PM).” 
They  were  seeking  a  way  to  reduce  the  PM  workloads  without  impairing  safety  or 
reliability.  This  is  the  same  type  of  question  applicable  to  SUAV  maintenance. 

The  Electric  Power  Research  Institute  (EPRI)  became  aware  of  the 
Nowlan  &  Heap  report  on  RCM  published  in  1978.  However,  after  the  initial  applications 
of  RCM,  many  plants  developed  their  own  methods  for  maintenance  optimization,  which 
deviated  from  RCM  principles.  “They  took  the  view  that  high  levels  of  redundancy  in 
their  safety  systems,  high  levels  of  regulations  imposing  failure-finding  tasks,  and  the 
fairly  simple  mission  of  the  power  generating  systems  at  such  plants  could  validly 
support certain simplifications of the methodology.” They also took the view that, in older
plants, existing experience had already revealed all potential failure modes and that the
nuclear power industry kept very detailed records. So “they felt that the
function analysis and the FMEA steps embodied in the RCM process could be
simplified.”

108  The above part is taken (in some places verbatim) from: Clarke Phill, “Letter to the Editor of New
Engineer Magazine regarding Professor David Sherwin at ICOMS 2000,” question 10, August 2000,
Internet, May 2004. Available at: http://www.assetpartnership.com/downloads.htm-13k

109  The material from this section is taken (in some places verbatim) from: Moubray, page 3.

“The most abbreviated approach,” recommended by EPRI in TR-105365
in September 1995, “modified the RCM process by setting up a list of simple functional
questions” without further functional analysis. The questions ask whether the component
failure leads to:

(1)  plant  trip  (shutdown), 

(2)  power  reduction  of  more  than  5%  (degradation), 

(3)  loss  of  a  safety  function, 

(4)  plant  transient  (recoverable), 

(5)  personnel  hazard,  or 

(6)  delay  in  start-up  (mission  delay)? 

“These  processes  achieved  their  limited  objectives  in  the  nuclear  industry” 
and  they  led  to  a  very  substantial  reduction  in  the  PM  workload  without  impairing  safety 
or reliability. Effects (1), (2), (3), and (5) are noticeable in SUAV operations. Effect (4) is
rarely tracked. Effect (6) occurs routinely but is rarely recorded.
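
The six screening questions lend themselves to a simple checklist. The following sketch records the yes/no answers for one component and reports which effects apply. Treating any “yes” as a reason to carry the component forward for further maintenance analysis is an assumption made for this illustration, and the function names are hypothetical; the component example is likewise invented.

```python
# The six effects from the abbreviated screening list described above.
SCREENING_EFFECTS = (
    "plant trip",
    "power reduction of more than 5%",
    "loss of a safety function",
    "plant transient",
    "personnel hazard",
    "delay in start-up",
)


def screen_component(answers):
    """Return the screening effects answered 'yes' for one component.

    `answers` maps an effect name to True/False; missing effects count as 'no'.
    """
    unknown = set(answers) - set(SCREENING_EFFECTS)
    if unknown:
        raise ValueError(f"unknown screening effects: {sorted(unknown)}")
    return [effect for effect in SCREENING_EFFECTS if answers.get(effect, False)]


def needs_further_analysis(answers):
    """Assumed rule: a single 'yes' is enough to keep the component under analysis."""
    return bool(screen_component(answers))


# Hypothetical component whose failure delays start-up and causes a transient.
answers = {"delay in start-up": True, "plant transient": True}
print(screen_component(answers), needs_further_analysis(answers))
```

For a SUAV, the same checklist could be relabeled in mission terms, but that mapping is left to the analyst.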

d.  RCM in NAVAIR 110

“As  reported  by  US  Naval  Air  Command  (NAVAIR),  current  operation 
and  support  (O&S)  costs  for  naval  aviation  weapon  systems  consume  50  to  60  percent  of 
the Navy’s total operating account” and tend to increase at a rate of 5 percent
per year.

NAVAIR,  which  was  one  of  the  sponsors  of  the  original  Nowlan  &  Heap 
report,  found  that  some  vendors  were  using  all  sorts  of  unique  and  custom-made 
processes,  which  they  described  as  “RCM  processes,”  to  develop  maintenance  programs 
for  equipment  that  they  were  selling  to  NAVAIR.  “In  this  age  of  ‘do  more  with  less,’ 
there  is  a  problem  that  has  infected  the  discipline  of  physical  asset  management.  In  the 
interest of saving time and money, corrupted versions of RCM, versions that
irresponsibly shorten the process, continue to flood the market. These tools are
incorrectly called RCM.”

110  The material from this section is taken (in some places verbatim) from: Regan, Nancy, RCM Team
Leader, Naval Air Warfare Center, Aircraft Division, “US Naval Aviation Implements RCM,” undated,
Internet, February 2004. Available at: http://www.mt-online.com/articles/0302_navalrcm.cfm

These wayward RCM processes led NAVAIR to approach the Society of
Automotive Engineers (SAE), a recognized standard-setting institution with close
relations to the US military and to the aerospace industry. As a result, SAE JA1011 was
published in August 1999. It is a brief document setting out the minimum criteria that
any process must meet to be called an RCM process when applied to any particular
asset or system. 111

When  NAVAIR  initially  implemented  RCM  in  some  systems,  the 
economic  savings  included,  on  average: 

•  Scheduled  maintenance  reduced  by  75  percent  per  year. 

•  Consumable  usage  decreased  88  percent  per  year. 

•  Disposal  of  hazardous  material  decreased  84  percent  per  year. 

e.  RCM in Industries Other Than Aviation and Nuclear Power 112

RCM  has  been  applied  in  many  industrial  sites  in  many  countries.  “These 
applications  have  embodied  the  performance  of  several  thousand  RCM  analyses.”  RCM 
applications have not been successful in every case. RCM can be said to have failed in about
one-third of the cases. None of the failed initiatives failed for technical reasons; all
failed for organizational ones. The two most common reasons for failure are:

(1)  The  head  internal  sponsor  of  the  effort  “quit  the  organization  or 
moved  to  a  different  position  before  the  new  ways  of  thinking  embodied  in  the  RCM 
process”  could  be  absorbed. 

(2)  The  internal  sponsor  and/or  the  consultant,  who  was  the  acting 
change  agent,  “could  not  generate  sufficient  enthusiasm  for  the  process,”  so  it  was  not 
applied  in  a  way  which  would  yield  results. 


111  The material from the above part of the section is taken (in some places verbatim) from: Aladon Ltd,
“About RCM.”

112  The material from this section is taken (in some places verbatim) from: Moubray, page 5.

Of course, the other two-thirds have been successful. There is “a high
correlation between the success rate of RCM-2 (MSG-3) applications and the change
management capabilities of the consultants involved.” For example, the (British) Royal
Navy (RN), which is a major user of SAE-compliant RCM, “has come to understand that
the capabilities of individual consultants are as important as the track record of their
employers.” So the “RN now insists on interviewing at great length every RCM
consultant that is at their disposal” to verify the commercial sincerity of the employers.

When discussing RCM, both the economic benefits and the question of
risk are considerations. For the economic benefits, in some cases “the payback period has
been measured in days and sometimes one or two years.” The normal period is weeks to
months. “These economic benefits flow from improved plant performance” mostly,
although in some cases users (especially military) have achieved very substantial
“reductions in direct maintenance costs.”

It is often said that RCM “is a good tool for developing maintenance
programs in ‘high risk’ situations” and that “some equipment items have such low impact
on business risk that the effort required to perform RCM analysis on them is greater than
the potential benefits.” The truth is that “no physical asset or system can be deemed to be
‘low risk’ unless it has been subjected at the very least to a zero-based FMECA” that
proves it is in fact low risk.

(1) From the results of thousands of RCM-2 (MSG-3) analyses that
are being performed around the world, and from incidents in supposedly “low risk”
industries, some industries have avoided very serious business consequences.

(2) On average, about 4% of the failure modes have direct safety or
environmental implications. Findings frequently showed that as many as 25% of the
failure modes were not receiving any form of preventive maintenance. Most of
those failure modes concern protective devices that had not been receiving proper
attention prior to the RCM-2 analysis.

About the supposedly “low risk” industries: automobile and food plants
are frequently said to be “low risk,” and therefore not worth strict and rigorous analysis.
The truth is that these industries cannot be characterized as low risk, as the following
examples indicate:

(1)  The  boiler  that  exploded  during  a  maintenance  inspection  at 
Ford’s  River  Rouge  plant  in  Detroit  in  February  1999,  killing  six  and  shutting  the  plant 
down  for  10  days, 

(2)  The  failure  of  the  Firestone  tires  on  Ford  Explorers,  which  has 
been  charged  to  the  design,  the  operating  pressure  and  to  manufacturing  process  failures. 
These  failures  put  the  existence  of  Firestone  as  a  company  at  risk, 

(3)  The  failure  of  a  filter  used  in  the  Perrier  water  bottling  in 
France,  leading  to  the  recall  of  thousands  of  Perrier  products  and  an  enormous  cost  to  the 
company. 

Although such events are rare, it is wrong to characterize a task, a component, or a
failure as “low risk,” especially if not all failure modes have been considered.

f.  FMEA and RCM 113

“An  FMEA,  usually  conducted  in  the  design  phase  of  an  equipment  or 
system,  can  also  be  used  as  a  tool  for  analysis  in  RCM.”  While  defining  the  functions  and 
desired standards of performance of an asset, the objectives of maintenance with respect
to that asset are defined. Defining functional failures enables us to explain what we mean
by “failed.” These two issues are addressed by the first two questions of the RCM
process. The next two questions seek to identify the failure modes that are reasonably
likely to cause each functional failure and to find out the failure effects associated with
each failure mode. This is done by performing an FMEA for each functional failure. An
FMEA contains:

•  Description  and  detection  for  each  failure  mode 

•  Cause  and  effects  of  each  failure 

•  Probability  of  failure  (occurrence) 

•  Criticality  of  failure  (severity) 

113  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  NASA,  slide  13. 
•  Corrective/preventive measures
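
The items listed above map naturally onto one record per failure mode. The following is a minimal sketch of such a record. The field names, the 1-to-10 rating scales, and the ranking by the product of occurrence and severity are assumptions made for illustration; the source lists the contents of an FMEA but does not prescribe scales or a ranking rule. The example entries are invented.

```python
from dataclasses import dataclass


@dataclass
class FmeaEntry:
    failure_mode: str   # description of the failure mode
    detection: str      # how the failure mode is detected
    cause: str
    effect: str
    occurrence: int     # probability of failure, assumed scale 1 (rare) to 10 (frequent)
    severity: int       # criticality of failure, assumed scale 1 (minor) to 10 (catastrophic)
    measures: str       # corrective/preventive measures

    def risk_score(self) -> int:
        """Assumed ranking metric: occurrence multiplied by severity."""
        return self.occurrence * self.severity


def rank_entries(entries):
    """Return the entries ordered from highest to lowest assumed risk score."""
    return sorted(entries, key=lambda entry: entry.risk_score(), reverse=True)


# Hypothetical entries for a small UAV.
entries = [
    FmeaEntry("battery connector loosens", "voltage telemetry drop", "vibration",
              "loss of power in flight", occurrence=4, severity=9,
              measures="add connector lock; preflight check"),
    FmeaEntry("camera lens fogs", "degraded video", "humidity",
              "reduced sensor capability", occurrence=6, severity=3,
              measures="desiccant pack in payload bay"),
]
for entry in rank_entries(entries):
    print(entry.failure_mode, entry.risk_score())
```

A FMECA would add an explicit criticality ranking on top of such records; the product used here is only a stand-in for that step.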

FMEA is the key to a successful commissioning program. For newly
developed systems, where the developing parties have gained little experience,
oversight is insufficient, and many potential circumstances are unknown, requirements
are neither standard nor certain. Requirements for systems under development, like UAVs,
are a matter of research, experience, and technology advances.

g.  FMECA 

Performing an FMEA on a new system under development, such as
SUAVs, is not an easy task because many details keep changing. A FMECA is a
better choice because it treats the critical issues as first priority. A FMECA is therefore
a first-step effort that can be carried out in such a case.

h.  FTA, FMEA, FMECA 114

“The question to be addressed when considering the most appropriate
system analysis tool is whether to conduct a FMECA/FMEA or a FTA.” The most
obvious answer to that question is “it depends.” The “criticality of a
mission and/or personnel safety” is the primary driving concern and the initial
reason for a FTA. A FTA is “finely focused” on a single point, whereas a FMECA
is not focused on one point but covers a broader area.

“If  there  are  many  different  areas  of  concern  and  all  of  them  need  to  be 
revealed,  then  a  FMECA  is  more  effective  because  it  has  a  greater  chance  of  finding  the 
critical  failure  modes.”  If  only  a  single  event  or  a  few  events  that  can  be  clearly  defined 
are  of  crucial  concern,  then  FTA  is  favored. 

The  desire  for  either  a  qualitative  and/or  a  quantitative  analysis  is  not  the 
distinguishing  factor  for  selecting  a  FTA  or  a  FMECA/FMEA.  Either  approach  can  give 
qualitative  or  quantitative  results.  The  following  table  gives  guidance  for  choosing 
between  FTA  and  FMECA/FMEA. 


114  The material from this section is taken (in some places verbatim) from: Reliability Analysis Center
(RAC), Fault Tree Analysis (FTA) Application Guide, 1990, pages 8-10.

FTA vs FMECA Selection Criteria

FTA Preferred

FMECA/FMEA Preferred

Safety of personnel or public as the primary concern

X

A small number of explicitly defined “top events”

X

Inability to clearly define a small number of “top events”

X

Mission completion is of critical importance

X

Any number of successful missions

X

“All possible” failure modes are of concern

X

“Human errors” contributions are of concern

X

“Software errors” contributions are of concern

X

A numerical “Risk evaluation” is the primary concern

X

System is highly complex and interconnected

X

System with linear architecture and little human or software
intervention

X

System is not repairable

X

Table 3.  FTA and FMECA/FMEA (After RAC FTA, page 10)

i.  FTA 115

FTA is an effective tool for any reliability program. It is a quick way of
“understanding the causes of a system’s inherent problems” and also a way to “identify
potential safety hazards during the design phase.”

Tailoring the FTA to fit the specific type of analysis that is necessary for a
certain scope requires two decisions. The first decision is the selection of the “top event,”
which is the target upon which the FTA is to focus. The second decision is whether the
analysis is to yield qualitative results, quantitative results, or both.
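
To illustrate the quantitative option, the following sketch evaluates the probability of a top event for a small fault tree built from AND and OR gates. It assumes statistically independent basic events, which is a simplifying assumption and not a claim from the source. The tree structure, the event names, and the probabilities are hypothetical values chosen only for the example.

```python
from math import prod


def top_event_probability(node, basic_probabilities):
    """Evaluate a fault tree given as nested tuples.

    A node is either a basic-event name (str) or a tuple ("AND" | "OR", child, ...).
    Basic events are assumed to be statistically independent.
    """
    if isinstance(node, str):
        return basic_probabilities[node]
    gate, *children = node
    child_probs = [top_event_probability(child, basic_probabilities) for child in children]
    if gate == "AND":
        # All inputs must occur.
        return prod(child_probs)
    if gate == "OR":
        # At least one input occurs.
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate type: {gate}")


# Hypothetical top event: loss of the air vehicle during a mission.
tree = ("OR",
        "engine failure",
        ("AND", "primary link loss", "backup link loss"))
probabilities = {
    "engine failure": 0.02,
    "primary link loss": 0.10,
    "backup link loss": 0.20,
}
print(round(top_event_probability(tree, probabilities), 4))
```

A qualitative analysis would instead list the combinations of basic events that cause the top event; here those are the engine failure alone, or both link losses together.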

j.  RCM  Revisited 

“RCM is better in the operating and support phase of the life cycle of a
system.” This is true when considering how and why RCM was created. For example,
airplanes have been in use since the beginning of the previous century. The general concept of
the airplane has been known for many years. Legal requirements and special regulations
controlling manned aviation have also been in place for many years. Thus, in this case,
RCM provided solutions to certain manned-aviation problems mainly related to
operations and maintenance issues, with safety and economics as the background. Similarly,
many other industries also employed RCM to solve such problems.

115  The material from this section is taken (in some places verbatim) from: RAC FTA, pages 9-11.

By  definition,  RCM  is  a  methodology  for  determining  the  most  cost- 
effective  maintenance  strategy  for  a  given  item  of  equipment  taking  into  account  its 
operating environment. When a product is in the design phase, the designers have little
historical experience, so the whole effort is focused on developing something that works
rather than on cost-effective strategies. The following table gives guidance for
choosing  between  MSG-3  and  FMEA/FMECA. 


FMEA/FMECA  vs  MSG-3  Selection  Criteria 

MSG-3 

Preferred 

FMEA/ 

FMECA 

Preferred 

Safety  of  personnel  or  public  as  the  primary  concern 

X 

Top-down  approach  of  failure  analysis 

X 

Bottom-up  approach  of  failure  analysis 

X 

System  is  highly  complex  and  interconnected 

X 

Early  design  and  development  phase 

X 

Implementation  cost 

X 

Implementation  timescale 

X 

Economy  issues  are  of  critical  importance 

X 

“All  possible”  failure  modes  are  of  concern 

X 

X 

“Human  errors”  contributions  are  of  concern 

X 

X 

“Software  errors”  contributions  are  of  concern 

X 

Systems  with  little  human  and  a  lot  of  software  intervention 

X 

First  tool  for  initial  failure  analysis 

X 

Available  for  the  entire  system  life-cycle  (long-term) 

X 

Available  for  the  entire  system  life-cycle  (short-term) 

X 

Implementation  effort 

X 

Operational  phase 

X 

Conducted  by  experienced  personnel 

X 

Training  requirements 

X 

Extensive  and  conclusive 

X 

System  with  linear  architecture  and  little  human  or  software 
intervention 

X 

Table  4.  MSG-3  and  FMECA/FMEA 

k.  UAVs, SUAVs versus Manned Aircraft

The primary difference between manned aircraft and UAVs is that
manned aircraft rely on the presence of humans to detect (sense) and respond to changes
in the vehicle’s operation. The human can sense the condition of the aircraft, for example
an unusual vibration that may indicate structural damage or impending engine failure.
Humans can sense events within and outside the vehicle, gaining what is known as
“situational awareness.”

For  manned  military  aviation  the  philosophy  is  pilot  and  aircraft-oriented. 
The  valuable  life  of  the  pilot  who  spends  so  much  time  in  studies,  training,  and  gaining 
the  experience  of  hundreds  of  flight  hours,  is  the  number  one  factor.  The  expensive,  state- 
of-the-art  multi-mission-capable  aircraft  is  the  number  two  factor.  For  UAVs,  the 
philosophy  is  mission  and  cost-oriented.  Different  missions  require  different  systems, 
different  platforms  with  different  capabilities.  It  is  also  desired  that  the  cost  should 
remain  as  low  as  possible.  Technology  helps  to  achieve  both  those  goals  for  UAVs. 
Better,  cheaper  technologies  can  be  adapted  very  easily  and  very  quickly  to  UAVs. 

UAVs can be remotely piloted (“controlled”) from the ground. It is
difficult for the pilot (operator) to achieve the same or better situational
awareness than if he were piloting a manned aircraft. For SUAVs, specifically, volume,
weight, cost, duration of flight, and sensor capabilities are the primary factors of interest.
Personnel safety is approached differently than in manned aviation. With costs ranging from
$15K to $300K per platform, SUAVs are considered expendable, though reusable, and are
treated accordingly. Thus SUAV reliability is low, since they are designed to be
inexpensive and to have a relatively short life cycle.

During  the  last  few  years,  commanders  no  longer  want  their  SUAVs  to  be 
“toys”  that  uncertainly  expand  their  capabilities.  Commanders  want  their  SUAVs  to  be 
operationally  effective  assets  to  help  win  battles.  “Operationally,  the  same  case  may  be 
made  for  ensuring  the  missions  are  completed  if  we  rely  on  UAVs  to  accomplish  mission 
critical tasks once done using manned assets.”116

116  Clough,  Bruce,  “UAVS-You  Want  Affordability  and  Capability?  Get  Autonomy!”  Air  Force 
Research  Laboratory,  2003. 
There are some facts about SUAV systems that require consideration:

(1)  They  are  potentially  valuable  on  battlefields. 

(2)  Unreliability  creates  operational  ineffectiveness. 

(3)  SUAV  design  philosophy  remains  mission  and  cost  oriented. 

(4)  Software  and  hardware  reliability  improvement  is  desirable. 

(5) Tracking the reliability of SUAVs provides insight into operational
availability. Currently, no system is in use to track SUAV reliability.

(6) Most SUAVs are not mature systems; they are in the design phase
or in operational testing.

(7)  Sensor  and  miniaturization  technology  for  SUAVs  changes  rapidly. 

(8)  Systems  are  not  highly  complex. 

(9)  The  new  unmanned  aviation  “community”  has  started  to  develop; 
experience  operating  SUAVs  has  just  started  to  accumulate. 

(10)  Human  factors  for  the  GCS  are  critical  since  they  are  the  linkage 
between  the  system  and  its  effective  employment. 

l. Conclusions-Three Main Considerations about UAV-RCM

The  reliability  tracking  and  improvement  system  for  SUAVs  must  be 
inexpensive,  easily  and  quickly  adapted,  and  implemented  by  a  few,  relatively 
inexperienced  personnel.  It  must  also  cover  the  entire  system’s  issues  of  hardware, 
software  and  human  factors.  The  safety  requirements  for  personnel  apply  only  to  the 
ground  operators  and  maintainers  and  the  main  source  of  data  for  hidden  failures  during 
flight  can  only  be  provided  by  telemetry.  Finally,  because  sensor  technology  is  rapidly 
developing  and  easily  implemented  due  to  low  cost,  the  reliability  tracking  and 
improvement  system  for  SUAVs  must  be  easily  adaptable  to  changes. 

From  the  above  we  can  construct  the  following  table  which  summarizes 
the  basic  differences  between  the  MSG-3  and  FMEA/FMECA  methods  with  respect  to 
SUAVs: 
SUAVs

RCM 

MSG-3 

FMEA/ 

FMECA 

1 

Reliability  improvement  needed 

X 

X 

2 

Mission  and  cost  oriented 

X 

3 

Operational  testing  and  development  phase 

X 

4 

Rapid  changes  in  technology 

X 

5 

Inexpensive  and  easily  adapted  methodology 

X 

6 

Telemetry  is  used  a  lot  (Hidden  failure  difficult  to  identify) 

X 

7 

Safety  for  operating  personnel  is  not  a  critical  issue 

X 

8 

Experienced  personnel  difficult  to  find 

X 

9 

Human  factors  for  GCS  is  critical 

X 

X 

Table 5. Comparing RCM MSG-3 and FMEA/FMECA for SUAVs.


So,  the  main  considerations  about  RCM  implementation  for  SUAVs  are: 

(1) Safety has an important role in RCM methodology because of
the nature of civil aviation. The primary goal for civil aviation is to transport people and
goods safely. Safety standards and strict rules are the top priority, and so they become a
priority in RCM analysis. For other industries where RCM has been applied, safety has almost
the same role as in the aviation case because of strict regulations and standards for the
operators and the employees. In the UAV case, however, there are no people onboard, so
safety for travelers and crew is not as critical an issue.

(2) In the RCM process, the key factor for the initial identification
of a hidden failure is the flight crew. In the UAV case, there is no crew aboard and so
there is no chance for a crew to sense hidden failures. The only indications that might be
available are the platform’s control sensor readings in flight and the system’s
performance while the platform is tested on the ground prior to take-off.

(3) Experience gained in civil aviation cannot be applied directly to UAVs.

From the above it is clear that RCM MSG-3 is not suitable for
SUAVs. This leaves fault tree analysis and FMEA as the remaining methods. We
develop both in detail for the SUAV in the subsequent chapters of this thesis.




B.  SMALL  UAV  RELIABILITY  MODELING 

During  recent  urban  operations  in  Iraq  and  Afghanistan,  SUAVs  that  provide 
over-the-hill  or  around-the-corner  information  were  invaluable  for  operating  teams.  Some 
systems  have  been  tested  with  very  good  results,  but  controversy  surrounds  the 
capabilities  of  such  systems.  A  generic  SUAV  system  must  provide  military  forces  with 
real-time  around  the  clock  surveillance,  target  acquisition,  and  battle  assessment.  Such  a 
system  must  be  capable  of  detecting  any  desired  tactical  information  in  a  designated 
sector. 

Each  service  component  (Navy,  Army,  and  Marines)  requires  versatile,  easy  to 
handle,  and  user-friendly  systems  that  enable  the  commander  to  conduct  reconnaissance 
on  the  battlefield  in  real-time.  SUAVs  are  being  seriously  considered  for  this  role.  This 
entails  a  small-scale  operation  over  a  city  block,  or  more  extensive  surveillance  missions. 
Requirements  of  the  system  include  locating  and  identifying  targets,  then  relaying  the 
information  to  a  higher  command.  The  detection  accuracy  should  be  sufficient  to  select 
and  to  deploy  weapons,  and  then  to  maintain  contact  after  engagement  with  such 
weapons.  The  system  must  be  able  to  survey  a  large  area  rapidly  using  multiple  platforms 
simultaneously.  The  configuration  of  the  system  should  enhance  the  fighting  capabilities 
of  the  force,  minimizing  the  time  for  precise  control  movements  and  maximizing 
mobility, robustness and functionality. Based on previous experience with similar systems,
reliability and interoperability are the most important considerations.

1.  System’s  High  Level  Functional  Architecture 

As illustrated in Figure 4, the high-level architecture of a SUAV battlefield system
consists of the following:

(1).  Platform(s) 

(a)  Navigation  with  Global  Positioning  System  (GPS)  and  Inertial 
Navigation  System  (INS) 

(b)  Flight  control  with  remote  manual,  semi-auto,  and  full-auto 
(autonomous)  mode  of  operation 
(c) Onboard computer (OBC)

(d)  Payload  with  the  appropriate  sensors  for  the  type  of  mission 

(2). Ground control station (GCS) with command, monitor and support
capabilities.117 This may be shipboard or land-based.


Figure 4. High Level Architecture of a SUAV System (After Fei-Bin)


For more detailed system architecture, refer to Figure 5:


117 Fei-Bin, Hsiao, and others, ICAS 2002, 23rd International Congress of Aeronautical Sciences,
proceedings, Toronto, Canada, 8 to 13 September 2002, Article: “The Development of a Low Cost
Autonomous UAV System”, Institute of Aeronautics, National Cheng Kung University, Tainan, Taiwan
ROC.
Environmental Considerations in Field Operation

1. Temperature

2. Humidity-Precipitation

3. Cloudiness

4. Lightning

5. Fog

6. Altitude

7. Icing Conditions

8. Wind Speed

9. Proximity to Sea

10. Proximity to Desert

11. Proximity to Inhabited Area


Figure  5.  Simple  Block  Diagram  of  a  SUAV  System 
For the platform’s configuration, weight and volume are critical factors because
of the limited size and the flight characteristics of the platform. The system is a complex
one and reliability plays an important role in the operational effectiveness of the system.
In general, there are two ways to increase reliability: fault tolerance and fault
avoidance.118

(1)  Fault  tolerance  can  be  accomplished  through  redundancy  in  hardware 
and/or  in  software.  The  disadvantage  is  that  it  increases  the  complexity  of  an  already 
complex  system,  as  well  as  increasing  equipment  costs,  volume,  weight,  and  power 
consumption. 

(2)  Fault  avoidance  can  be  accomplished  by  improving  reliability  of 
certain  components  that  constitute  the  system.  In  general,  those  components  that 
contribute  the  most  to  reliability  degradation  are  the  most  critical  for  fault  avoidance. 
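
The trade-off between the two approaches can be illustrated with a small numerical sketch. It assumes independent components in a simple series arrangement, and every reliability value in it is hypothetical, chosen only for illustration; none comes from SUAV data. The function names are likewise illustrative.

```python
def series_reliability(reliabilities):
    """Reliability of a chain in which every component must work."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel_reliability(reliabilities):
    """Reliability of redundant units where one working unit is enough."""
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail


# Hypothetical three-component chain; the first component is the weakest.
baseline = series_reliability([0.90, 0.99, 0.98])

# Fault tolerance: duplicate the weakest component (adds weight, volume, cost).
tolerant = series_reliability([parallel_reliability([0.90, 0.90]), 0.99, 0.98])

# Fault avoidance: improve the weakest component itself (assumed 0.90 -> 0.97).
avoidant = series_reliability([0.97, 0.99, 0.98])

print(round(baseline, 6), round(tolerant, 6), round(avoidant, 6))
```

Under these assumed numbers redundancy gives the larger gain, but only by duplicating hardware; improving the weakest component recovers most of the gain without the added weight and volume.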

The SUAV system cannot implement fault tolerance, at least for the platforms, because of
the weight, volume and cost penalties noted above, so fault avoidance is the better
approach. To achieve this, we must first conduct an FMEA in order to define and identify
each subsystem function and its associated failure modes for each functional output.

To proceed with the FMEA, we as analysts will need the following:

(1)  System  definition  and  functional  breakdown, 

(2)  Block  diagram  of  the  system, 

(3)  Theory  of  operation, 

(4)  Ground  rules  and  assumptions, 

(5)  Software  specifications. 

As  a  second  step,  we  conduct  a  criticality  analysis  in  order  to  identify  those 
mission  critical  elements  that  cause  potential  failures  and  weaknesses. 


118  Reliability  Analysis  Center  (RAC),  Reliability  Toolkit:  Commercial  Practices  Edition.  A  Practical 
Guide  for  Commercial  Products  and  Military  Systems  Under  Acquisition  Reform,  2004,  page  115. 
To perform these analyses, we will use the qualitative approach due to a lack of
failure rate data and a lack of the appropriate level of detail for part configuration.119
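
A qualitative criticality ranking of this kind can be sketched as follows. The sketch assumes the severity categories (I to IV) and probability-of-occurrence levels (A to E) commonly used in qualitative FMECA practice, for example in MIL-STD-1629A. The failure mode names are taken from the fault-tree lists later in this chapter, but the category and level assigned to each are hypothetical, and sorting by severity first and likelihood second is a simplification of a full criticality matrix.

```python
# Severity categories and occurrence levels, most serious / most likely first.
SEVERITY = {"I": "Catastrophic", "II": "Critical", "III": "Marginal", "IV": "Minor"}
LIKELIHOOD = {
    "A": "Frequent",
    "B": "Reasonably probable",
    "C": "Occasional",
    "D": "Remote",
    "E": "Extremely unlikely",
}


def criticality_rank(severity, likelihood):
    """Return a sort key; a smaller key means a more critical failure mode."""
    severity_order = list(SEVERITY)      # dicts keep insertion order
    likelihood_order = list(LIKELIHOOD)
    return (severity_order.index(severity), likelihood_order.index(likelihood))


# (failure mode, severity category, occurrence level): assignments are hypothetical.
failure_modes = [
    ("Battery discharge", "II", "B"),
    ("Engine stalling", "I", "C"),
    ("Loss of GPS signal", "III", "B"),
    ("Propeller disconnection", "I", "D"),
]

ranked = sorted(failure_modes, key=lambda fm: criticality_rank(fm[1], fm[2]))
for name, sev, lik in ranked:
    print(f"{name}: {SEVERITY[sev]} / {LIKELIHOOD[lik]}")
```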

2. System Overview

The airborne system comprises the aerial platform and an onboard system. The
ground system comprises a PC and a modem to communicate with the airborne system.
All the onboard hardware is packed in a suitable model platform with a wingspan of 1.5
meters (m) and a fuselage diameter of 12 centimeters (cm), powered by a 1.5 kilowatt (kW)
aviation fuel (JP-5) engine. The sensor payload is about two kilograms
(kg).

The onboard computing system is being developed on a PC-based single-board
computer. The onboard computer (OBC) runs a multi-tasking real-time operating system.
The OBC can obtain data from the GPS, the INS, the communication system and the
onboard flight and mission sensors. It computes the flight control and navigation
algorithms, commands the sensor payload, and stores and downlinks data to the GCS in
near real time.

The  GCS  PC  is  the  equivalent  of  a  pilot’s  cockpit.  It  can  display  in  near  real  time 
the  status  of  the  flying  UAV  or  UAVs  including: 

•  UAV(s)  position  and  GCS  position 

•  Speed 

•  Altitude 

•  Course 

•  Attitude  and  system  health  in  visual  pilot-like  instruments 

•  The  actual  position  can  also  be  displayed  on  an  electronic  moving  map. 

•  Output  from  the  mission  sensors  such  as  near  real  time  imagery  displayed 
from  various  types  of  cameras  like  CCD,  infrared  (IR)  and  others. 


119  Reliability  Analysis  Center  (RAC),  Failure  Mode,  Effects  and  Criticality  Analysis  (FMECA), 
1993,  pages  9-13. 
3. System Definition


Figure  6.  Simple  Block  Functional  Diagram  of  a  SUAV  System 
Using the diagram in Figure 6, we give the following functional definitions to
each element in the diagram.

Platform Structure: The flying physical asset responsible for integration of all
the necessary equipment for the mission profile.

Antennas: Responsible for conducting the transmitted and received signals
to/from the GCS and passing them from/to transmitters or receivers and the appropriate
communication hardware and software.

Payload, Cameras and Other Sensors: The actual physical assets for the type of
desired mission, consisting mainly of cameras and other special sensors such as NBC agent
detectors, magnetic disturbance detectors, and others.

GPS:  The  primary  navigation  system  based  on  a  satellite  network  known  as  the 
Global  Positioning  System. 

INS:  The  support  navigation  system  based  on  the  inertial  calculations  of  current 
speed  and  course  in  order  to  provide  an  accurate  platform  fix  that  will  be  used  for  piloting 
the  platform  and  for  target  tracking. 

Engine: The unit responsible for providing mechanical power to be used in
conjunction with the propeller to provide thrust to the platform.

Battery:  The  electric  power  supply  asset  for  the  entire  platform’s  equipment 
service. 

Flight Controls: The necessary flight sensors (such as pitot tubes), hardware,
ailerons, elevators, rudder, the relevant servo units and the flight controller, together with
the appropriate software for manual, semi-auto and autonomous flight.

Proper  Software:  The  necessary  software  for  platform  mission  control. 

Landing Gear: Responsible for platform mobility on the ground during takeoff and
landing. Not mandatory for use.

Fuel Tank: Storage of fuel necessary for engine operation.
GCS: The manned shipboard or land-based component of the system, serving as the
command, control, communication and support center.

GCS  Flight  Controls:  The  GCS  hardware  and  software  for  flight  controls. 

Sensors  Control:  Main  factor  responsible  for  mission  performance.  Manually 
operated  with  auto  capabilities. 

Screen  Output:  The  outcome  of  the  systems’  performance  presented  on  a 
monitor  with  all  the  relevant  information  for  the  mission  and  the  system. 

GCS  Antennas:  Conducts  the  transmitted  and  received  signals  to  and  from  the 
platform  and  other  centers  related  to  the  mission  and  passes  them  to/from  transmitters  or 
receivers  and  the  appropriate  communication  hardware  and  software. 

GCS  Proper  Software:  The  necessary  software  for  GCS  mission  control. 

Battery  Charger:  Charges  the  platform  battery. 

Start-up Device: Responsible for the initial start-up of the platform’s engine prior
to takeoff.

Spare  Parts:  Necessary  items  for  operating  and  supporting  the  system. 

Power  Supply:  Generator  and  batteries  that  provide  the  GCS  electric  power. 

Personnel:  A  pilot,  a  load/sensor  operator,  and  maintainers  who  man  the  system 
for  one  shift. 

Launching  Device:  Launches  the  platform. 

Landing Auto Recovery Unit: Provides auto guidance to the platforms for
auto-landings.

4.  System  Critical  Functions  Analysis 

The  SUAV  essential  functions  analysis  can  be  seen  in  Table  6. 
Item

Essential  Functions 

Mission  Phases 

Stand-by 

Launch 

Cruise  to  Area 

of  Interest 

On  Station 

Cruise  Back  to 

Base 

Land 

Off  Station 

Flight 

1 

Provide  structural  integrity 

X 

X 

X 

X 

X 

X 

X 

2 

Provide  lift  and  thrust 

X 

X 

X 

X 

X 

Provide  controlled  flight 

3 

Manual  control 

X 

X 

X 

X 

X 

4 

Semi  auto 

X 

X 

X 

X 

X 

5 

Auto 

X 

X 

X 

X 

X 

6 

Navigate 

X 

X 

X 

X 

7 

Provide  power  to  control 
and  navigation  equipment 

X 

X 

X 

X 

X 

X 

8 

Withstand  environmental 
factors  (mainly  wind) 

X 

X 

X 

X 

Mission 

9 

Start  systems 

X 

10 

System’s  backup 

X 

X 

11 

Communications 

X 

X 

X 

X 

12 

Line  of  sight 

X 

X 

X 

X 

13 

Provide  power  to  sensors 
and  communications 

X 

X 

X 

X 

14 

Detect,  locate  and  identify 
targets 

X 

X 

X 

15 

Provide  data 

X 

X 

X 

16 

Provide  video  image 

X 

X 

X 

17 

Monitor  system’s  functions 

X 

X 

X 

X 

X 

X 

X 

Table  6.  System’s  Essential  Functions  Analysis 
5. System Functions

The  mission  phase  consists  of  the  following  functions: 


•  Launch  the  platform 

•  Fly  the  platform 

•  Control,  Command  and  Communicate  with  the  platform 

•  Control,  Command  and  Communicate  with  the  platform  sensors 

•  Perform  surveillance  and  reconnaissance 

•  Detect  targets 

•  Identify  targets 

•  Classify  targets 

•  Track  targets 

•  Perform  battle  assessment 

•  Know  platform’s  position 

•  Sustain  flight  mission  for  a  certain  time  at  a  certain  altitude  at  a  certain 
speed  and  on  a  certain  course 

•  Return  to  base  and  land  safely 

•  Service  the  platform  at  a  certain  time  and  set  it  ready  for  the  next  mission 

These  functions  are  the  primary  drivers  for  software  development  and  among  the 
factors  for  the  hardware  selection. 

6. Fault Tree Analysis

In  the  following  fault-tree  analysis  of  a  SUAV  system  a  top-down  analysis  has 
been  used  to  reveal  the  failure  causes.  The  sub-analyses  end  with  a  circle,  which  means 
that  further  analyses  are  needed  at  a  more  detailed  level,  or  end  with  a  diamond,  which 
means  that  the  analysis  stops  there.  Due  to  a  lack  of  data,  only  the  mechanical  engine 
failure  has  been  analyzed  at  more  than  one  level.  Using  that  analysis  we  formulate  a 
model  to  use  as  an  example  for  further  analysis. 
7. Loss of Mission

The first fault tree to construct is the loss-of-mission tree.


The  reasons  for  mission  loss  may  be: 

(1)  Loss  of  platform 

(2) Loss of GCS

(3) Unable to locate platform (loss of platform’s position)

(4)  Inappropriate  mission  for  the  sensors  (wrong  choice  of  sensors) 

(5)  Sensor(s)  failure 

(6)  Unable  to  launch  platforms  for  various  reasons,  such  as  weather  or 
launching  device  failure 

(7)  Unable  to  communicate  with  the  platform 

(8)  Loss  of  the  operator(s) 

(9)  Loss  of  the  onboard  platform’s  or  GCS’s  computer 

(10) Reasons external to the system, such as weather conditions or
situational reasons.

Figure  7  illustrates  the  tree  analysis  for  loss  of  mission. 
Figure 7. Loss of Mission
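
Because any one of the listed causes is sufficient to lose the mission, the top event can be read as an OR gate over its inputs. The sketch below computes the probability of such a top event. It assumes the causes are statistically independent, and all probabilities are hypothetical placeholders, since no failure rate data are available for the SUAV; the function and variable names are illustrative only.

```python
from math import prod


def or_gate_probability(event_probabilities):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in event_probabilities)


# Hypothetical per-mission probabilities for the ten causes listed above.
loss_of_mission_causes = {
    "Loss of platform": 0.020,
    "Loss of GCS": 0.005,
    "Unable to locate platform": 0.010,
    "Inappropriate mission for the sensors": 0.005,
    "Sensor(s) failure": 0.015,
    "Unable to launch platforms": 0.010,
    "Unable to communicate with the platform": 0.010,
    "Loss of the operator(s)": 0.002,
    "Loss of onboard or GCS computer": 0.005,
    "Out-of-system reasons": 0.020,
}

p_loss_of_mission = or_gate_probability(loss_of_mission_causes.values())
print(f"P(loss of mission) = {p_loss_of_mission:.4f}")
```

The independence assumption is a simplification: loss of GCS, for example, appears both here and as a cause of loss of platform, so a quantitative treatment would have to account for such repeated events.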
8. Loss of Platform

The  reasons  for  loss  of  platform  may  be: 

(1)  Loss  of  platform’s  structural  integrity 

(2)  Loss  of  platform’s  lift 

(3)  Loss  of  thrust 

(4)  Loss  of  platform’s  control 

(5) Loss of GCS

(6)  Loss  of  platform’s  position 

Figure  8  illustrates  the  tree  analysis  for  loss  of  platform. 
Figure 8. Loss of Platform

9. Loss of GCS

The  reasons  for  loss  of  GCS  may  be: 

(1)  GCS  software  failure 

(2)  Loss  of  OBC 

(3)  Loss  of  GCS  power 

(4)  Loss  of  GCS  communication 

(5)  Loss  of  GCS  personnel 

(6)  Environmental  reasons  (e.g.  heavy  weather  conditions,  earthquake) 

(7)  Fire 

Figure  9  presents  the  tree  analysis  for  loss  of  GCS. 
Figure 9. Loss of GCS

10. Loss of Platform’s Structural Integrity

The  reasons  for  loss  of  platform’s  structural  integrity  include  fuselage,  wing,  or 
empennage  related  problems,  which  could  be  due  to: 

(1)  Fracture 

(2)  Pressure  overload 

(3)  Thermal  weakening 

(4)  Delamination  or  fiber  buckling 

(5)  Structural  connection  failure  or 

(6)  Operator  error. 

Figure 10 contains the fault-tree analysis for loss of platform’s structural integrity.

Figure 10. Loss of Structural Integrity

11. Loss of Lift

Reasons  for  loss  of  lift  may  be: 


(1)  Loss  of  thrust 

(2)  Operator  error,  or 

(3)  Loss  of  wing  surface,  which  could  be  due  to  loss  of  right  or  left  wing 
surface,  which  in  turn  could  be  due  to: 

(a)  Fracture  removal 

(b)  Pressure  overload 

(c)  Thermal  weakening 

(d)  Delamination  or  fiber  buckling 

(e)  Structural  connection  failure  or 

(f)  Operator  error 

Figure 11 shows the fault-tree analysis for loss of lift.

Figure 11. Loss of Lift

12. Loss of Thrust

Reasons  for  loss  of  thrust  may  be: 

(1)  Loss  of  engine  control 

(2)  Operator  error 

(3)  Loss  of  propeller  that  could  be  due  to: 

(a)  Propeller  structural  failure 

(b)  Propeller  disconnection 

(c)  Operator  error 

(4)  Loss  of  engine,  which  could  be  due  to: 

(a)  Engine  failure 

(b)  Engine  stalling,  which  could  be  due  to: 

((1)) Failure of fuel system
((2)) Operator error
((3)) Air filter failure
((4)) Air filter clogged
((5)) Engine control failure

Figure 12 shows the tree analysis for loss of thrust.

Figure  12.  Loss  of  Thrust 
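
The nested list for loss of thrust maps naturally onto a tree data structure. The sketch below encodes the list exactly as given in this section and walks it to enumerate the lowest-level events and to find events that occur on more than one branch. The tuple representation and the function names are illustrative assumptions, not part of the original analysis; "leaf" here means the lowest level shown in this section, although several of these events (engine failure, failure of fuel system, operator error) are developed further in later sections.

```python
from collections import Counter

# Each node is (name, children); a node with an empty child list is a leaf.
LOSS_OF_THRUST = ("Loss of thrust", [
    ("Loss of engine control", []),
    ("Operator error", []),
    ("Loss of propeller", [
        ("Propeller structural failure", []),
        ("Propeller disconnection", []),
        ("Operator error", []),
    ]),
    ("Loss of engine", [
        ("Engine failure", []),
        ("Engine stalling", [
            ("Failure of fuel system", []),
            ("Operator error", []),
            ("Air filter failure", []),
            ("Air filter clogged", []),
            ("Engine control failure", []),
        ]),
    ]),
])


def leaf_paths(node, path=()):
    """Yield the path from the top event down to every leaf event."""
    name, children = node
    path = path + (name,)
    if not children:
        yield path
    else:
        for child in children:
            yield from leaf_paths(child, path)


def repeated_leaves(node):
    """Return the leaf events that appear on more than one branch."""
    counts = Counter(path[-1] for path in leaf_paths(node))
    return sorted(name for name, n in counts.items() if n > 1)


for path in leaf_paths(LOSS_OF_THRUST):
    print(" -> ".join(path))
print("Repeated events:", repeated_leaves(LOSS_OF_THRUST))
```

Repeated events such as operator error matter for any later quantitative step, because the branches that share them cannot be treated as independent.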
13. Loss of Platform Control

Reasons  for  loss  of  control  may  be: 

(1)  Loss  of  lift 

(2)  Loss  of  control  channel 

(3)  Loss  of  power,  which  could  be  due  to: 

(a)  Total  loss  of  platform’s  power 

(b)  Loss  of  control  unit  power 

(4)  Loss  of  aileron  forces  that  could  be  due  to: 

(a)  Loss  of  left  wing  aileron  force  that  could  be  due  to: 

((1))  Loss  of  onboard  computer  (OBC) 

((2))  Disruption  of  control  cables 
((3))  Loss  of  servo  unit 
((4))  Loss  of  aileron  surface 

(b) Loss of right-wing aileron force, for the same reasons as the
left-wing aileron

(5) Loss of rudder force, for the same reasons as the left-wing aileron

(6) Loss of elevator force, for the same reasons as the left-wing aileron

Figure 13 illustrates the tree analysis for loss of platform’s control.
Figure 13. Loss of Platform’s Control

14. Loss of Platform Position

Reasons  for  loss  of  platform  position  may  be: 

(1)  Loss  of  line  of  sight  (LOS) 

(2)  Loss  of  INS  backup 

(3)  Loss  of  GPS  unit 

(4)  Loss  of  GPS  antenna 

(5)  Loss  of  GPS  signal 

(6)  Platform  failure  to  transmit 

Figure  14  shows  the  tree  analysis  for  loss  of  platform’s  position: 


Figure  14.  Loss  of  Platform’s  Position 


15. Loss of Control Channel

The reasons for loss of control channel may be:

(1)  Operator  or  pilot  control  panel  failure 

(2)  Loss  of  LOS 

(3)  Failure  of  control  receiver 

(4)  Failure  of  GCS  control  transmitter 

(5)  Loss  of  power,  which  could  be  due  to: 

(a)  Loss  of  platform’s  power 

(b)  Loss  of  GCS  power 

(6)  Loss  of  platform  control  antenna,  which  could  be  due  to: 

(a)  Antenna  disconnection 

(b)  Short-circuit  in  antenna 

(c)  Antenna  failure 

(d)  Structural  damage 

(7) Loss of GCS control antenna, for the same reasons as loss of platform
control antenna

Figure  15  illustrates  the  tree  analysis  for  loss  of  control  channel. 
Figure 15. Loss of Control Channel

16. Engine Control Failure

Engine  control  failure  may  be  caused  by: 

(1)  Disruption  of  control  cables 

(2)  Loss  of  OBC 

(3)  Loss  of  LOS 

(4)  Loss  of  servo  unit 

(5)  Carburetor  failure 

(6)  Engine  failure 

The fault-tree analysis for engine control failure can be seen in Figure 16.


Figure 16. Engine Control Failure

17. Engine Failure

The  reasons  for  engine  failure  may  be: 

(1)  Mechanical  engine  failure 

(2)  Excessive  engine  vibration 

(3)  Fuel/air  improper  mixture 

(4)  Improper  fuel 

(5)  Engine  fire 

(6) Loss of lubrication, which could be due to:

(a)  Gas  and  lubricant  improper  mixture 

(b)  Excessive  engine  temperature  rise 

(c)  Improper  lubricant 

The  fault-tree  analysis  for  engine  failure  can  be  seen  in  Figure  17. 
Figure 17. Engine Failure

18. Failure of Fuel System

The  reasons  for  fuel  system  failure  may  be: 

(1)  Failure  of  engine  fuel  system,  which  could  be  due  to: 

(a)  Fuel  pump  line  failure 

(b)  Fuel  pump  failure 

(c)  Fire 

(d)  Penetration  of  fuel  lines 

(e)  Carburetor  failure 

(2)  Loss  of  fuel  supply,  which  could  be  due  to: 

(a)  Fuel  tank  lines  failure 

(b)  Fire  and/or  explosion 

(c)  Fuel  depletion 

(d)  Penetration  of  fuel  lines 

(e)  Penetration  of  fuel  tank 

(f)  Hydrodynamic  ram 

The fault-tree analysis for fuel system failure can be seen in Figure 18.

Figure 18. Fuel System Failure

19. Loss of Platform Power

The reasons for loss of platform power may be:

(1)  Wiring  short-circuit 

(2)  Fuse  failure  that  could  be  due  to: 

(a)  Circuit  problem 

(b)  Improper  fuse 

(3)  Battery  failure  that  could  be  due  to: 

(a)  Battery  discharge 

(b)  Improper  battery 

(c)  Battery  disconnection 

(d)  Battery  short-circuit 

(e)  Battery  exhaustion 

(f)  Battery  not  fully  charged 

Figure  19  illustrates  the  fault-tree  analysis  for  loss  of  platform  power. 
Figure 19. Loss of Platform Power

20. Loss of GCS Power

Reasons  for  loss  of  GCS  power  may  be: 


(1)  Wiring  short-circuit 

(2)  Fuse  failure  that  could  be  due  to: 

(a)  Circuit  problem 

(b)  Improper  fuse 

(3)  Main  and  auxiliary  power  failure 

(4)  Power  disconnection 

(5)  Loss  of  GCS  generator 

Figure 20 shows the fault-tree analysis for loss of GCS power.

Figure 20. Loss of GCS Power

21. Operator Error

Reasons  for  operator  error  may  be: 


(1)  Inadequate  personnel  training 

(2)  Personnel  fatigue 

(3)  Personnel  frustration  and  lack  of  experience 

(4)  Inadequate  man  machine  interface 

(5)  Operator’s  wrong  reaction  to  failure 

(6)  Misjudgment  due  to  environmental  reasons  (mainly  weather) 

(7)  Poor  documentation  of  procedures 

(8)  Poor  workload  balance  resulting  in  task  saturation  with  resulting  loss 
of  situational  awareness 

(9)  Ergonomics  (Human  factors)  of  GCS 

Figure 21 illustrates the fault-tree analysis for operator error.

Figure 21. Operator Error

22. Mechanical Engine Failure

Reasons  for  mechanical  engine  failure  may  be: 


(1)  Bad  material  of  engine  parts: 

(a)  Engine  block 

(b)  Cylinder  head 

(c)  Connecting  rod(s) 

(d)  Piston(s) 

(e)  Piston  rings 

(f)  Bearings 

(g)  Crankshaft 

(2)  Normal  engine  wear 

(3)  Bad  manufacture  of  engine  parts 

(4)  Bad  design  of  the  whole  engine  or  engine  parts 

(5)  Insufficient  or  bad  maintenance 

(6)  Carburetor  failure 

(7)  Inappropriate  engine  operation 

(8)  Overheating 

(9)  Crash  damage,  which  is  due  to  operator’s  error 

(10)  Engine  vibrations. 

Figure 22 shows the fault-tree analysis for mechanical engine failure.

Figure 22. Mechanical Engine Failure

23. Engine Vibrations

Reasons for engine vibrations may be:

(1)  Broken  piston 

(2)  Bearing  failure 

(3)  Broken  piston  rings 

(4) Bad manufacture of engine parts such as:

(a) Cylinder head

(b) Connecting rod(s)

(c)  Piston(s) 

(d)  Piston  rings 

(e)  Bearings 

(f)  Crankshaft 

(5)  Bad  design  of  the  whole  engine  or  engine  parts 

(6)  Improper  engine  mounting 

(7)  Lack  of  propeller  balancing 

Figure  23  shows  the  fault-tree  analysis  for  engine  vibrations. 
24. Overheating

Reasons  for  engine  overheating  may  be: 

(1)  Broken  piston  rings 

(2)  Bearing  failure 

(3) Bad manufacture of engine parts such as:

(a) Cylinder head

(b) Connecting rod(s)

(c)  Piston(s) 

(d)  Piston  rings 

(e)  Bearings 

(f)  Crankshaft 

(4)  Bad  design  of  the  whole  engine  or  engine  parts 

(5)  Dirty  cooling  surfaces 

(6)  Bad  lubricant 

(7)  Engine  operating  too  fast  due  to: 

(a)  Improper  propeller  size 

(b)  Improper  engine  adjustments 

(c)  Inappropriate  fuel 

(8)  Bad  material  of  engine  parts 

Figure 24 illustrates the fault-tree analysis for engine overheating.

Figure 24. Overheating

25. Inappropriate Engine Operation

Reasons for inappropriate engine operation may be:


(1)  Improper  engine  adjustment,  mounting,  disassembly 

(2) Inappropriate fuel/lubricant mixture

(3) Improper propeller size

(4) Inappropriate fuel and/or lubricant

(5) Engine stall (during flight)

(6) Bad carburetor adjustments

(7)  Inappropriate  engine  cleaning  and/or  storage  after  flights 

(8) Inappropriate lean runs (starting after a long period of storage without any precautions) with conditions such as rusted bearings, a seized connecting rod or piston, or dry piston rings

(9)  Propeller  stops  abruptly  (due  to  external  reason)  while  turning. 

The fault-tree analysis for inappropriate engine operation can be seen in Figure 25.

26. Follow-on Analysis for the Model

The top event occurs through different combinations of basic events, and a fault tree provides useful information about these combinations. To describe them, we introduce the concept of the “cut set.” A cut set is “a set of basic events” whose occurrences result in the top event. A cut set is said to be a “minimal cut set” if removing any basic event from it leaves a set that is no longer a cut set. 120

For example, Figure 26 shows that the set {1, 2, 3, 4} is a cut set because if all four basic events occur, then the top event occurs.


Figure  26.  Example  for  Cut  Set.  (After  Kececioglu,  page  223) 


120  Kececioglu,  D.,  Reliability  Engineering  Handbook  Volume  2,  Prentice  Hall  Inc.,  1991,  page  222. 
This is not a minimal cut set, however, because if basic event 1 or basic event 2 is removed from this set, the remaining basic events, {2, 3, 4} and {1, 3, 4} respectively, still form cut sets. These two sets are the minimal cut sets in that example.
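The two definitions can be checked mechanically. The sketch below is only an illustration: it assumes that the tree of Figure 26 has the structure (1 OR 2) AND 3 AND 4, which is consistent with the two minimal cut sets named above but is not taken from the figure itself, and all function names are hypothetical.

```python
from itertools import combinations

def top_event(occurred):
    """Assumed structure for the Figure 26 example: (1 OR 2) AND 3 AND 4."""
    return (1 in occurred or 2 in occurred) and 3 in occurred and 4 in occurred

def is_cut_set(events):
    """A cut set is a set of basic events whose occurrence causes the top event."""
    return top_event(set(events))

def is_minimal_cut_set(events):
    """Minimal: a cut set that stops being one when any single event is removed."""
    events = set(events)
    if not is_cut_set(events):
        return False
    return all(not is_cut_set(events - {e}) for e in events)

def minimal_cut_sets(basic_events):
    """Brute-force search over all subsets; fine for a handful of events."""
    found = []
    for size in range(1, len(basic_events) + 1):
        for combo in combinations(sorted(basic_events), size):
            if is_minimal_cut_set(combo):
                found.append(set(combo))
    return found

if __name__ == "__main__":
    print(is_cut_set({1, 2, 3, 4}))          # a cut set
    print(is_minimal_cut_set({1, 2, 3, 4}))  # but not a minimal one
    print(minimal_cut_sets({1, 2, 3, 4}))
```

The brute-force search grows exponentially with the number of basic events, so it is useful only for checking small examples by hand.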

The SUAV fault trees contain no AND gates; only OR gates are present. For example, finding the minimal cut sets for engine failure involves the gates in the following diagrams:

(1) Engine failure diagram (E1)

(2)  Mechanical  engine  failure  (E3) 

(3)  Engine  vibrations  (E5) 

(4)  Operator  error  (E5) 

(5)  Overheating  (E4) 

(6)  Inappropriate  engine  operation  (E6) 

Naming the gates G1, G2, up to G8, we number each basic event related to each of the gates. For example, in the engine failure diagram we have gate G1 with the following inputs:

(1) Mechanical engine failure that corresponds to gate G2 in Diagram E3.

(2) 1G1, engine fire

(3)  2G1,  improper  fuel 

(4)  3G1,  fuel/air  improper  mixture 

(5)  4G1,  excessive  engine  vibrations 

(6)  G3,  the  gate  that  corresponds  to  loss  of  lubrication 

(a)  1G3,  improper  lubricant 

(b) 2G3, excessive engine temperature rise

(c)  3G3,  gas/lubricant  improper  mixture 

Working in the same way we end up with the diagram in Figure 27.

Figure 27. Engine Failure Combined Diagram

The MOCUS algorithm generates the minimal cut sets for a fault tree in which only AND and OR gates exist. In this algorithm, an OR gate increases the number of cut sets while an AND gate increases the size of a cut set.121 The MOCUS “algorithm is best explained by an example.”122 In the following paragraph, we follow the steps of the MOCUS algorithm to determine the minimal cut sets.

Locating the uppermost gate, which is the OR gate G1, we replace G1 with a vertical arrangement of the inputs to that gate. Had it been an AND gate, we would have replaced it with a horizontal arrangement of its inputs. Continuing to the next level, locating the gates there, and replacing them in the way prescribed above yields Table 7.


121  Kececioglu,  page  222. 

122  Hoyland,  page  88. 
1

2 

3 

4 

5 

Last 

G1 

1G1

1G1

1G1

1G1

2G1 

2G1 

2G1 

3G1 

3G1 

3G1 

4G1 

4G1 

4G1 

4G1 

1G2 

G2 

1G2 

1G2 

G3 

2G2 

2G2 

6G2 

3G2 

3G2 

1G3 

4G2 

4G2 

5G2 

5G2 

6G2 

6G2 

G4 

1G4 

G5 

G6 

7G4 

G7 

1G5 

1G3 

2G3 

9G5 

3G3 

1G6 

7G6 

G8 

1G7 

11G7 

1G3 

2G3 

3G3 

3G8 

Table  7.  Cut  Set  Analysis.  (After  Kececioglu,  page  229) 


In the last column of Table 7, we have the set of minimal cut sets for the engine failure, which is ({1G1},{2G1},{3G1},{4G1},{1G2},{2G2},...,{1G8},{2G8},{3G8}). Every minimal cut set contains a single element because the tree contains only OR gates and no AND gates.
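The replacement rule described above (vertical arrangement for an OR gate, horizontal arrangement for an AND gate) can be sketched in a few lines. This is a minimal illustration in the style of MOCUS, not the handbook's own listing; the dictionary layout, the function name, and the abbreviated inputs of gate G2 are assumptions made for the example.

```python
def mocus(gates, top):
    """Top-down cut-set generation in the style of MOCUS.

    gates maps a gate name to (kind, inputs), with kind "AND" or "OR".
    Any name that is not a key of gates is treated as a basic event.
    """
    rows = [[top]]
    while True:
        # Locate the first gate that still appears in any row.
        target = None
        for r, row in enumerate(rows):
            for i, element in enumerate(row):
                if element in gates:
                    target = (r, i)
                    break
            if target is not None:
                break
        if target is None:
            break  # only basic events remain
        r, i = target
        row = rows[r]
        kind, inputs = gates[row[i]]
        rest = row[:i] + row[i + 1:]
        if kind == "AND":
            expanded = [rest + list(inputs)]             # horizontal: same row grows
        else:
            expanded = [rest + [inp] for inp in inputs]  # vertical: one row per input
        rows[r:r + 1] = expanded
    cut_sets = {frozenset(row) for row in rows}
    # Keep only cut sets that have no proper subset among the cut sets.
    minimal = [s for s in cut_sets if not any(other < s for other in cut_sets)]
    return sorted(minimal, key=sorted)

# Part of the engine-failure tree; the inputs of G2 are abbreviated (hypothetical).
ENGINE_TREE = {
    "G1": ("OR", ["1G1", "2G1", "3G1", "4G1", "G2", "G3"]),
    "G2": ("OR", ["1G2", "2G2"]),
    "G3": ("OR", ["1G3", "2G3", "3G3"]),
}

if __name__ == "__main__":
    for cut_set in mocus(ENGINE_TREE, "G1"):
        print(sorted(cut_set))
```

With only OR gates every row holds exactly one basic event, which is why all the minimal cut sets in the last column are single-element sets.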

An equivalent approach to the MOCUS algorithm starts from the lowermost gates. It replaces an OR gate with the union (+) sign and an AND gate with the intersection (*) sign. After all the expressions are obtained, it continues the procedure at the gates one step above the lowermost gates, and so on until the expression for the top event is obtained. 123

Following this algorithm, we must end up with the same result as the MOCUS algorithm, given as an expression of intersections and unions. In our case, we end up with:

E1 = 1G1 + 2G1 + 3G1 + 4G1 + 1G2 + 2G2 + ... + 1G8 + 2G8 + 3G8

The diagram equivalent to that expression is given in Figure 28.


Figure  28.  Equivalent  Diagram 


123  Kececioglu,  page  230. 
Trying to find the equivalent block-diagram representation, we end up with a “chain-like” representation that can be seen in Figure 29. A fault-tree representation of a system can be converted into a block-diagram representation by replacing the AND gates with parallel boxes and the OR gates with boxes in series. 124


Figure  29.  Equivalent  Block  Diagram 


In a series structure, the component with the lowest reliability is the most important one. We can compare that with a chain: a chain is never stronger than its weakest link. So the most important element for reliability improvement is the one with the lowest reliability. 125 Reliability for a series system can also be explained by the use of Structural Functions, which are summarized in Appendix D.
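The weakest-link remark can be illustrated numerically. The sketch below assumes independent components, in which case the reliability of a series system is the product of the component reliabilities; the component names and values are hypothetical and are not data from this thesis.

```python
from math import prod

def series_reliability(reliabilities):
    """Reliability of a series system of independent components."""
    return prod(reliabilities.values())

def weakest_link(reliabilities):
    """Name of the component with the lowest reliability."""
    return min(reliabilities, key=reliabilities.get)

if __name__ == "__main__":
    # Hypothetical component reliabilities, for illustration only.
    components = {"bearing": 0.95, "carburetor": 0.97, "piston rings": 0.999}
    print(series_reliability(components))
    print(weakest_link(components))
```

Because every factor is at most 1, the system reliability can never exceed that of its least reliable component, which is the formal version of the chain comparison.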

27. Criticality Analysis

For  the  criticality  matrix,  we  need  a  metric  for  the  severity-of-failure  effect,  so  we 
can  use  the  designations  in  Table  8. 


Description     Classification   Mishap definition
Catastrophic    I                System or platform loss
Critical        II               Major system damage
Marginal        III              Minor system damage
Minor           IV               Less than minor system damage

Table 8. Classification of Failures According To Severity (After RAC FMECA, page 26)


124  Blischke,  page  220. 

125  Hoyland,  page  197. 
Because historical data are lacking, it is appropriate to use a qualitative approach to classify failures according to their occurrence number, which is the overall probability of failure during the item operating time interval, as illustrated in Table 9.126


Level  Occurrence           Description                           Occurrence number
A      Frequent             High probability of occurrence        >0.20
B      Reasonably probable  Moderate probability of occurrence    >0.10 and <0.20
C      Occasional           Occasional probability of occurrence  >0.01 and <0.10
D      Remote               Unlikely probability of occurrence    >0.001 and <0.01
E      Extremely Unlikely   Essentially zero                      <0.001

Table 9. Classification of Failures According To Occurrence
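When failure data do become available, the levels of Table 9 can be assigned automatically from an estimated probability. The sketch below is a hypothetical helper, not part of the RAC procedure. Table 9 uses strict inequalities and so leaves the boundary values unassigned; the sketch assumes that a boundary value falls into the lower level.

```python
def occurrence_level(probability):
    """Map a failure probability to the occurrence levels A-E of Table 9.

    Assumption: a value exactly on a boundary (0.20, 0.10, 0.01, 0.001)
    is assigned to the lower of the two adjacent levels.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability > 0.20:
        return "A"  # frequent
    if probability > 0.10:
        return "B"  # reasonably probable
    if probability > 0.01:
        return "C"  # occasional
    if probability > 0.001:
        return "D"  # remote
    return "E"      # extremely unlikely

if __name__ == "__main__":
    for p in (0.25, 0.15, 0.05, 0.005, 0.0005):
        print(p, occurrence_level(p))
```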


From  our  previous  analysis  for  engine  failure  using  FTA,  we  ended  up  with  the 
following  reasons: 

a.  Excessive  engine  vibrations 

b.  Fuel/air  improper  mixture 

c.  Improper  fuel 

d.  Engine  fire 

e.  Gas  and  lubricant  improper  mixture 
f. Excessive engine temperature rise

g.  Improper  lubricant 

h.  Inadequate  personnel  training 

126  RAC  FMECA,  page  60. 
i. Personnel fatigue

j. Operator’s frustration and lack of experience

k. Inadequate man machine interface

l. Operator’s wrong reaction to failure

m. Environmental reasons

n. Misjudgment due to environmental reasons (mainly weather)

o. Poor documentation of procedures

p. Poor workload balance resulting in task saturation with resulting loss of situational awareness

q. Ergonomics (Human factors) of GCS

r. Bad material

s. Normal engine wear

t. Bad manufacture

u. Bad design

v. Insufficient maintenance

w. Carburetor failure

x. Broken piston

y. Bearing failure

z. Improper engine mounting

aa. Lack of propeller balancing

bb. Broken piston rings

cc. Bearing failure

dd. Dirty cooling areas

ee. Improper propeller size

ff. Improper engine adjustments

gg. Broken piston rings

hh. Engine stalls (during flight)

ii. Bad carburetor adjustments

jj. Inappropriate engine cleaning and/or storage after flights

kk. Inappropriate lean runs such as rusted bearings, seized connecting rod or piston

ll. Propeller stops while turning.

From the above, we can derive the following issues for an engine failure criticality analysis. The ratings are initially based on our own experience and judgment because current operators do not track failures:




Number  Issue                             ID   Probability of  Severity of
                                               occurrence      failure effect
1       Excessive engine vibrations       L1   D               II
2       Engine fire                       L2   D               I
3       Fuel type                         L3   D               III
4       Lubricant type                    L4   D               III
5       Fuel/air mixture adjustment       L5   C               III
6       Gas and lubricant mixture         L6   D               III
7       Personnel training                P1   C               II
8       Operator’s frustration            P2   C               II
9       Personnel experience              P3   B               III
10      Poor documentation of procedures  P4   C               II
11      Poor workload balance             P5   C               II
12      Ergonomics of GCS                 P6   C               II
13      Misjudgment                       P7   B               II
14      Environmental reasons             P8   C               II
15      Man machine interface             P9   D               III
16      Maintenance                       P10  D               II
17      Engine adjustments                P11  C               III
18      Usage                             P12  B               II
19      Manufacture                       P13  D               III
20      Software failure                  S    D               II
21      Material                          M1   D               I
22      Hardware failure                  M2   E               III
23      Design                            M3   D               II
24      Engine wear                       M4   D               II
25      Carburetor                        M5   C               II
26      Piston                            M6   E               II
27      Bearing                           M7   C               I
28      Piston rings                      M8   E               I
29      Propeller size                    PR   E               II
30      Engine temperature                T1   D               II
31      Cooling areas                     T2   D               II

Table 10. Qualitative Occurrence and Severity Table
Our next step is to construct the criticality matrix based on the previous qualitative analysis table:


[Criticality matrix. Vertical axis: probability of occurrence, levels E to A (increasing). Horizontal axis: severity classification, IV to I (increasing). The cells hold issue IDs from Table 10.]


Figure  30.  Engine  Failure  Criticality  Matrix.  (After  RAC  FMECA,  page  33) 

“The criticality matrix provides a visual representation of the critical areas” of our engine failure analysis. 127 Items in the upper right corner of the matrix require the most immediate action and attention because they have a high probability of occurrence and a catastrophic or critical severity. Moving diagonally toward the lower left corner of the matrix, criticality and severity decrease. When different items have the same severity and criticality, safety and cost are the driving factors of the analysis. For SUAVs, we give safety less weight because we are dealing with unmanned systems, but we do have to consider cost.


127 RAC FMECA, pages 33-34.

Table 11 shows the results from our analysis:


Number  Issue                             ID   Probability of  Severity of
                                               Occurrence      Failure Effect
1       Misjudgment                       P7   B               II
2       Usage                             P12  B               II
3       Bearing                           M7   C               I
4       Personnel training                P1   C               II
5       Operator’s frustration            P2   C               II
6       Personnel experience              P3   B               III
7       Poor documentation of procedures  P4   C               II
8       Poor workload balance             P5   C               II
9       Ergonomics of GCS                 P6   C               II
10      Environmental reasons             P8   C               II
11      Carburetor                        M5   C               II
12      Fuel/air mixture adjustment       L5   C               III
13      Engine adjustments                P11  C               III

Table 11. Results from Engine Failure Criticality Analysis. The most critical issues are highlighted.
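The step from Table 10 to the criticality matrix and to Table 11 can be sketched as follows. The selection rule used here (keep every issue whose occurrence level is C or higher) is inferred from comparing the two tables and is not stated in the text; the function names are hypothetical, and the IDs are transcribed from Table 10 with the garbled characters normalized.

```python
from collections import defaultdict

# (ID, occurrence level, severity class) transcribed from Table 10.
TABLE_10 = [
    ("L1", "D", "II"), ("L2", "D", "I"), ("L3", "D", "III"), ("L4", "D", "III"),
    ("L5", "C", "III"), ("L6", "D", "III"), ("P1", "C", "II"), ("P2", "C", "II"),
    ("P3", "B", "III"), ("P4", "C", "II"), ("P5", "C", "II"), ("P6", "C", "II"),
    ("P7", "B", "II"), ("P8", "C", "II"), ("P9", "D", "III"), ("P10", "D", "II"),
    ("P11", "C", "III"), ("P12", "B", "II"), ("P13", "D", "III"), ("S", "D", "II"),
    ("M1", "D", "I"), ("M2", "E", "III"), ("M3", "D", "II"), ("M4", "D", "II"),
    ("M5", "C", "II"), ("M6", "E", "II"), ("M7", "C", "I"), ("M8", "E", "I"),
    ("PR", "E", "II"), ("T1", "D", "II"), ("T2", "D", "II"),
]

def criticality_matrix(rows):
    """Group issue IDs by (occurrence level, severity class) matrix cell."""
    cells = defaultdict(list)
    for item_id, level, severity in rows:
        cells[(level, severity)].append(item_id)
    return dict(cells)

def shortlist(rows, levels=("A", "B", "C")):
    """Inferred rule: keep the issues whose occurrence level is C or higher."""
    return [item_id for item_id, level, _ in rows if level in levels]

if __name__ == "__main__":
    matrix = criticality_matrix(TABLE_10)
    for cell in sorted(matrix):
        print(cell, matrix[cell])
    print(shortlist(TABLE_10))
```

Under this inferred rule the shortlist contains the same thirteen IDs as Table 11, although Table 11 orders them differently.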


28. Interpretation of Results

From the above, it is obvious how important the human factor is. The way the user operates the system is among the most critical factors for our engine failure mode: the ability to make the right decisions, frustration, training, experience, poor workload balance among the operators, and poor documentation of procedures. The way the user maintains the system, which is also related to training and experience, and the ability to adjust the engine and the fuel-air mixture properly are also among the critical contributors to the engine failure mode.

The importance of the bearing and the carburetor is clearly shown. According to our analysis, those two parts are the most critical among all the parts composing the engine.

Finally, environmental reasons complete the list of the most critical issues that could result in an engine failure.
III. DATA COLLECTION SYSTEMS


A. RELIABILITY GROWTH AND CONTINUOUS IMPROVEMENT PROCESS

SUAVs  do  not  have  a  FRACAS  system.  In  this  section  of  the  thesis  we  construct 
one.  The  FRACAS  system  is  addressed  to  the  Program  Manager  of  any  SUAV  type 
during  the  phase  of  design,  development  or  operation. 

1. Failure Reporting and Corrective Action System (FRACAS) 128

“The  basic  measure  of  FRACAS  effectiveness  is  its  ability  to  function  as  a 
closed-loop  coordinated  system”  in  identifying  and  repairing  product  and/or  process 
failure  modes,  and  identifying,  implementing  and  verifying  a  corrective  action  to  prevent 
repetition  of  the  failure.  “As  a  result,  early  elimination  of  causes  of  failure  or  trends,” 
greatly  improves  reliability. 

At  each  stage  of  product  development,  the  closed-loop  FRACAS  should  collect 
and evaluate information for each failure incident, as shown in Figure 31.


128  The  material  for  this  section  is  taken  (in  some  places  verbatim)  from:  RAC  Toolkit,  pages  284-289, 
and:  National  Aeronautics  and  Space  Administration  (NASA),  “Preferred  Reliability  Practices:  Problem 
Reporting  and  Corrective  Action  System  (PRACAS),”  practice  NO.  PD-ED-1255,  Internet,  February  2004. 
Available at: http://klabs.org/DEI/References/design_guidelines/design_series/1255ksc.pdf
Figure 31. Closed-loop for FRACAS (After NASA, FRACAS, page 2)


In  order  to  conduct  FRACAS,  we  need  to  follow  a  FRACAS  flow  and  evaluation 
checklist: 


a.  Failure  Observation 

In  the  first  step,  we  identify  that  a  failure  incident  has  occurred  and  we 
notify  all  required  personnel  about  the  failure. 
b. Failure Documentation

We  record  all  relevant  data  describing  the  conditions  in  which  the  failure 
has  occurred.  A  detailed  description  of  the  failure  incident  as  well  as  supporting  data  and 
equipment  operating  hours  is  needed. 

c.  Failure  Verification 

If  the  failure  is  permanent,  then  we  verify  the  incident  by  performing  tests 
for  failure  identification.  If  the  failure  is  not  permanent,  then  we  verify  the  incident  by 
uncovering  the  conditions  in  which  the  failure  has  occurred.  Finally,  if  the  failure  cannot 
be  verified,  we  pay  close  attention  to  the  reoccurrence  of  failure. 

d.  Failure  Isolation 

For  failures  that  were  verified,  we  perform  testing  and  troubleshooting  to 
isolate  their  causes.  Isolating  failure  can  identify  a  defective  part  or  parts  of  the  system, 
or  it  can  relate  the  incident  to  other  reasons,  like  operator’s  error,  test  equipment  failure, 
improper  procedures,  lack  of  personnel  training,  etc. 

e.  Replacement  of  Problematic  Part(s) 

For the above failures, we replace the problematic part or parts with a known good one and replicate the conditions under which the failure occurred. By testing, we confirm that the correct part (or parts) has been replaced. If the failure reappears, we repeat failure isolation in order to determine the cause of failure correctly. We have to tag the replaced part or parts, including all relevant documentation and data.

f.  Problematic  Part(s)  Verification 

We have to verify the problematic part(s) independently of the system. If the failure cannot be confirmed, then we have to review failure verification and isolation to identify the correct failed part(s). Isolating the failure to the lowest possible level of the system’s decomposition is the key to revealing the root failure cause.

g.  Data  Search 

In this step, it is necessary to search historical databases and reports for documentation of similar or identical failures. The databases could come from the implementation of the FRACAS methodology itself, from an FMEA, or from other technical reports. Failure tendencies or patterns, if any, must be evaluated because they may reveal defective lots of parts, bad design, bad manufacturing, or even bad usage. Such historical data is obviously absent for SUAV systems.

h.  Failure  Analysis 

A failure analysis to determine the root failure cause follows next. The depth and extent of the failure analysis depend on the criticality of the mission, the system’s reliability impact, and the related cost. The outcome of the failure analysis should specify the failure causes and identify any external causes.

i. Root-Cause Analysis

This  answers  the  question,  “what  could  have  been  done  to  prevent 
failure?”  It  focuses  more  on  the  true  nature  of  failure,  which  could  be  due  to: 

•  Overstress  conditions 

•  Design  error 

•  Manufacturing  defect 

•  Unfavorable  environmental  conditions 

•  Operator  or  procedural  error,  etc 

j.  Determine  Corrective  Action 

In this phase, we have to develop a corrective action. We have to rely on the failure analysis and root-cause analysis results, and to be effective our solution should prevent reappearance of the failure in the long term. Corrective actions could be:

•  System  redesign 

•  Part(s)  redesign 

•  Selection  of  different  parts  or  suppliers 

•  Improvements  in  processes 

•  Improvements  in  manufacturing  etc 

k.  Incorporate  Corrective  Action  and  Operational  Performance  Test 

Now, we can incorporate the identified corrective action in the failed system and perform initial baseline tests as a start in order to verify the desired performance. After the first successful results, our tests should become operational tests that include the conditions under which the failure occurred. After documenting all test results, we can compare them with the pre-failure test results to identify alterations in baseline data. Testing should be sufficient to give us confidence that the original failure mode has been eliminated and will not recur. Before large-scale incorporation of a corrective action, the action must first be verified to avoid unnecessary delays and expenses.

l. Determine Effectiveness of Corrective Action

We have to verify that our corrective action:

•  Has successfully corrected the failure

•  Has not created or induced other failures

•  Has not degraded performance below acceptable levels

If the original failure recurs, we have to repeat the FRACAS process from the beginning to determine the correct root cause.

m.  Incorporate  Corrective  Action  into  All  Systems 

After verifying our corrective action in one system, we can implement our solution in all similar systems. We have to keep the FRACAS procedure running in order to track, document, report, and determine the correct root cause and the corrective action necessary for all failure modes that appear. Corrective actions involve changes to procedures, alterations to processes, and personnel training, so tracking is necessary to ensure that the new versions are implemented correctly and not confused with old ones.
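The closed-loop character of steps a through m can be sketched as a simple trace generator. This is only an illustration: the step names follow the headings above, while the function name and the idea of counting failed effectiveness checks are assumptions introduced for the example.

```python
FRACAS_STEPS = [
    "Failure observation",
    "Failure documentation",
    "Failure verification",
    "Failure isolation",
    "Replacement of problematic part(s)",
    "Problematic part(s) verification",
    "Data search",
    "Failure analysis",
    "Root-cause analysis",
    "Determine corrective action",
    "Incorporate corrective action and operational performance test",
    "Determine effectiveness of corrective action",
    "Incorporate corrective action into all systems",
]

def fracas_trace(failed_checks=0):
    """List the steps visited when the effectiveness check fails a given number of times.

    Each failed check (the original failure recurs) sends the process back
    to the first step; only a passed check reaches the final step.
    """
    check = FRACAS_STEPS.index("Determine effectiveness of corrective action")
    trace = []
    for _ in range(failed_checks):
        trace.extend(FRACAS_STEPS[:check + 1])  # loop closes: start again
    trace.extend(FRACAS_STEPS)                  # final, successful pass
    return trace

if __name__ == "__main__":
    for number, step in enumerate(fracas_trace(failed_checks=1), start=1):
        print(number, step)
```

The corrective action is rolled out to all systems only on the pass in which the effectiveness check succeeds, which is what makes the loop closed.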

2.  FRACAS  Basics 

Basically,  the  system  must  provide  exact  information  on: 

a.  What  was  the  failure? 

b.  How  did  the  failure  occur? 

c.  Why  did  the  failure  occur? 

(1)  Was  it  an  equipment  or  part  design  error? 

(2)  Was  it  an  equipment  or  part  manufacturer  workmanship  error? 

(3)  Was  it  a  software  error? 

(4)  Was  it  a  test  operator  error? 

(5)  Was  it  a  test  procedure  or  equipment  error? 

(6)  Was  it  an  induced  failure? 
d. How can we prevent such failures from reoccurring?

From all the above we can simplify the procedure to the checklist shown in Figures 32 and 33.


Figure 32. FRACAS Methodology Checklist page 1/2

Figure  33.  FRACAS  Methodology  Checklist  page  2/2 
3. FRACAS Forms

I  have  developed  forms  to  implement  the  FRACAS  methodology  for  SUAVs: 


a. Failure report as shown in Tables 12 and 13.

b. Failure analysis report as shown in Table 14. 129

c. Corrective action verification report as shown in Table 15.

d. Tag to problematic part as shown in Table 16.

e. Failure Log-Sheet as shown in Table 17.

During  recent  operations  experimenting  with  the  Surveillance  and  Tactical 
Acquisition  Network  (STAN)  project  at  Camp  Roberts,  observers  identified  reliability 
and  operational  availability  issues  for  SUAVs.  As  a  result,  I  developed  these  forms  for 
use  during  upcoming  operations  with  the  XPV  1-B  TERN  SUAV  system.  130 

These forms were presented to a VC-6 team for use during the STAN experiment in May 2004. The effort to implement these forms was not successful. The primary reason was a lack of personnel training related to the FRACAS system itself, to filling out the forms, and to the general concept of reliability. The secondary reason was a lack of coordination and control in filling out the forms. It was obvious that a member of the operating team, assigned the extra task of coordinating and controlling proper data entry for the forms, was needed.

“It is preferable to attempt to communicate the ‘big picture,’ so that each team member is sensitive to failure detection” and identification, and “the appropriate corrective action process.” 131 Nevertheless, it is typical, especially in military applications, to have overall control, so a centralized FRACAS administration within a team or teams is needed.

The forms cover all aspects of SUAV design, development, production and
operation, with emphasis on experienced operation or test teams. All forms are addressed


129  RAC  Toolkit,  page  290. 

130  Gottfried. 

131  RAC  Toolkit,  page  284. 
to the operating or test team. The Failure Analysis Report form is also addressed to the design and development team.

Using the forms, we can collect information and data to the level of detail necessary to identify design and/or process deficiencies that should be eliminated, preferably before the SUAV is released to its users in the battlefield. For that reason, the forms can be used for other systems as well as SUAVs.

The  characteristics  of  the  forms  are: 

a. Simple and easy to implement, even by one or two persons

b. Brief in content and implementation time (time oriented)

c. Suitable for cheap systems like SUAVs (cost oriented)

d. Focused on elimination of fault reoccurrence

e. Generate a data collection that can be used as a database source

There  are  no  known  forms  of  FRACAS  or  any  other  reliability  tracking  system 
that  have  been  used  for  SUAV  testing  or  operation  in  the  past. 

4. Discussion of the Form Terms

Most  of  the  terms  in  those  forms  are  self-explanatory.  Some  discussion  follows 
for  some  of  them. 

a. For the Initial Failure Report in Table 12:

(1) Total Operating Hours, in position (8), is the cumulative operating hours for the SUAV system.

(2) Current Mission Hours, in position (9), is the operating hours since the beginning of the mission in which the fault was detected.

(3) Description of Failure, in position (15), is the full description of the observed failure.

(4) Supporting Data, in position (16), lists all available telemetry data related to the failure time. Environmental Parameters/Conditions, in position (16a), is the list of available environmental conditions and parameters, such as temperature, wind speed, humidity/precipitation, cloudiness, lightning, fog, icing conditions, and proximity to sea, desert, or inhabited area. System Parameters/Conditions, in position (16b), is a list of system conditions and parameters, such as flight altitude, platform speed, engine RPM, fuel level, battery status, communication status, and LOS availability.

(5) Actions for Failure Verification, in position (17), is a description of the operators’ actions that verify the failure.

(6) Affected Subsystems, in positions (18) to (22) and (23) to (27), are references to the subsystems of the SUAV system affected during the failure incident.

(7) System Condition after Failure, in position (29), is a description of the system’s general condition after the failure. For example, “platform crashed due to loss of control.”

b. For the Failure Report continued in Table 13:

(1) Problematic Parts Recognized, in position (16), is a list of the affected parts that have been recognized after the failure.

(2) Problematic Parts Replaced, in position (17), is a list of the parts that have been replaced after the failure in an effort to isolate the failure cause.

(3) Root Failure Cause, in position (18), is an estimate or the outcome of the previous efforts to isolate the failure cause.

(4) Previous Similar or Same Cases (if any), in position (29), is a reference to similar or identical failure cases based on historical data or other accurate sources.

(5) Background, in position (30), is all background information related to the failure. For example, it could be an explanation of a sensor subsystem or a software function.

c. For the Failure Analysis Report in Table 14:
(1) History, in position (31), is a complete description of the observed failure and all the events that followed.

(2) Analysis, in position (32), is the failure analysis based on data, drawings and blueprints, manuals, and opinions from experts, designers, and operators.

(3) Conclusions, in position (33), are the analysis outcome.

(4) Corrective Action/Recommendation, in position (34), is the result of that form. This is the recommended solution to the problem.

d. For the Corrective Action Verification Report:

(1) Operating Hours after Previous Failure, in position (13), is the cumulative operating hours since the previous failure that resulted in corrective action being taken for the SUAV system.

(2) Tests for Corrective Action Verification Made, in position (17), is a reference to all tests that have been made to verify that the recommended solution is correct.

(3) Alterations from Baseline Data, in position (22), is a list of all alterations from the initial data readings after the implementation of the recommended solution.

(4) Corrective Action Taken, in position (21), is a statement about the corrective actions that have been taken in order to solve the problem.

e. For the Failure Log-Sheet:

All entries in the Log-Sheet, like Date, Time, Number, Initial Report Number and Failure Description, must be consistent with the relevant entries in the other forms. In that way we can easily track the failure cases when needed.

In all forms except the Log-Sheet form, there is a field for Comments. It covers any other detail that the operator or the tester considers relevant to the failure and worth mentioning.
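The consistency requirement between the Log-Sheet and the other forms lends itself to an automated check once the forms are kept electronically. The sketch below is a hypothetical data layout, not a transcription of the forms: the class names, field names, and sample values are all assumptions chosen to mirror the entries named above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InitialFailureReport:
    report_number: str   # the form's "No" field
    failure_date: str
    failure_time: str
    description: str

@dataclass(frozen=True)
class LogEntry:
    number: int          # running number on the Log-Sheet
    date: str
    time: str
    initial_report_number: str
    failure_description: str

def mismatches(entry, report):
    """Return the Log-Sheet fields that disagree with the initial failure report."""
    pairs = {
        "initial_report_number": (entry.initial_report_number, report.report_number),
        "date": (entry.date, report.failure_date),
        "time": (entry.time, report.failure_time),
        "failure_description": (entry.failure_description, report.description),
    }
    return [name for name, (logged, reported) in pairs.items() if logged != reported]

if __name__ == "__main__":
    # Illustrative values only.
    report = InitialFailureReport("IFR-001", "2004-05-12", "10:45", "Engine stall in flight")
    entry = LogEntry(1, "2004-05-12", "10:54", "IFR-001", "Engine stall in flight")
    print(mismatches(entry, report))
```

Running such a check whenever a Log-Sheet line is entered would catch transcription errors while the operating team can still correct them.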
Form type: Initial Failure Report

2. Page 1 of
3. Project ID
4. System
5. Serial No
6. Detected During
7. Failure Date, Time
8. Total Operating Hours
9. Current Mission Hours
10. Reported by
11. Verified by
12. System Operated by:
13. Type of System’s Mission
14. Type of Failure (permanent/recoverable)
15. Description of Failure
16. Supporting Data:
    a. Environmental Parameters/Conditions
    b. System Parameters/Conditions
17. Actions for Failure Verification
18. Name
19. Reference Drawings
20. Part No
21. Manufacturer
22. Serial No
23. Name
24. Reference Drawings
25. Part No
26. Manufacturer
27. Serial No
28. Quick Failure Assessment (if any)
29. System Condition after Failure
30. Comments
31. Prepared by
32. Date
33. Checked (reliability)
34. Date
35. Problem No
Form type: Failure Report (Continued)

1. No
2. Page 1 of
3. Project ID
4. System
5. Serial No
6. Detected During
7. Failure Date, Time
8. Total Operating Hours
9. Current Mission Hours
10. Reported by
11. Verified by
12. System Operated by:
13. Type of System's Mission
14. Number of Failure
15. Description of Failure (brief)
16. Problematic Parts Recognized:
17. Problematic Parts Replaced:
18. Root Failure Cause
19. Name
20. Reference Drawings
21. Part No
22. Manufacturer
23. Serial No
24. Tagged by:
25. Failure Verified by (reliability):
26. Failure Verified by (engineering):
27. Failure Verified by (program):
28. System Condition after Replacement:
29. Previous Similar or Same Cases (if any)
30. Background
31. Comments
32. Prepared by
33. Date
34. Checked (reliability)
35. Date
36. Problem No
37. Checked (engineering)
38. Date
39. Checked (program)
40. Date
41. Distribution

Table 13. Failure Report Continuation Form




Form type: Failure Analysis Report

1. No
2. Page 1 of
3. Project ID
4. System
5. Serial No
6. Test Level
7. Failure Date
8. Operating Hours
9. Reported by

MAJOR COMPONENT OR UNIT
10. Name
11. Reference Drawings
12. Part No
13. Manufacturer
14. Serial No

15. Name
16. Reference Drawings
17. Part No
18. Manufacturer
19. Serial No

20. Name
21. Reference Drawings
22. Part No
23. Manufacturer
24. Serial No

PART(S)
25. Name
26. Reference Drawings
27. Part No
28. Manufacturer
29. Serial No

30. Related MRs and PINs
31. History
32. Analysis
33. Conclusions
34. Corrective Action/Recommendation
35. Corrective Action by
36. Document No
37. Corrective Action Effectiveness
38. Prepared by
39. Date
40. Approval (reliability)
41. Date
42. Problem No
43. Approval (engineering)
44. Date
45. Approval (program)
46. Date
47. Distribution

Table 14. Failure Analysis Report Form (From RAC Toolkit, page 290)
Form type: Corrective Action Verification Report

1. No
2. Page 1 of
3. Project ID
4. System
5. Serial No
6. Test Level
7. Failure Date
8. Total Operating Hours
9. Reported by
10. Initial Failure Report form Number
11. Failure Report Continue form Number
11. Failure Analysis Report
12. Current Mission Hours Before Failure
13. Operation Hours after Previous Failure
14. Number of Corrective Action Taken
16. Related Drawings, Documents, Other Data
17. Tests for Corrective Action Verification Made
18. Test Conditions
    a. Environmental Conditions
    b. System Condition
19. Test Results
20. Alterations from Baseline Data
21. Corrective Action Taken
22. Comments
23. Corrective Action Taken by
24. Date
25. Document No
26. Corrective Action Effectiveness
27. Prepared by
28. Date
29. Approval (reliability)
30. Date
31. Problem No
32. Approval (engineering)
33. Date
34. Approval (program)
35. Date
36. Distribution

Table 15. Corrective Action Verification Report Form
Form type: Tag to Problematic Part

1. No
2. Page 1 of
3. Project ID
4. System
5. Serial No
6. Detected During
7. Failure Date
8. System's Total Operating Hours
9. Reported by
10. Initial Failure Report form Number
11. Failure Report Continue form Number
12. Failure Analysis Report
13. Corrective Action Verification Report
14. Operation Hours after Previous Failure
15. Total Number of Failures
16. Failure Description
17. Failure Relevant Documentation
18. History
19. Name
20. Reference Drawings
21. Part No
22. Manufacturer
23. Serial No
24. Tagged by:
25. Failure Verified by (reliability):
26. Failure Verified by (engineering):
27. Failure Verified by (program):
28. System Condition after Replacement:
29. Comments
30. Verified by
31. Date
32. Document No
33. Corrective Action Effectiveness
34. Prepared by
35. Date
36. Approval (reliability)
37. Date
38. Problem No
39. Approval (engineering)
40. Date
41. Approval (program)
42. Date
43. Distribution

Table 16. Tag to Problematic Part Form
Form type: Failure Log-Sheet

1. Number
2. Date
3. Time
4. Operator
5. Failure Description (brief)
6. Reported?
7. Initial Report Number
8. Initials
9. Checked by
10. Date
11. Mission Description

Table 17. Failure Log-Sheet

Use of these forms will allow detailed analysis of the causes of failure and
detailed modeling of reliability by subsequent analysts.
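For analysts who later move these paper forms into software, the consistency requirement between the Log-Sheet and the other forms could be checked automatically. The following is a minimal sketch only: the class names, field names, and the `inconsistencies` helper are hypothetical illustrations, not part of the forms or of any existing FRACAS tool, and only a few of the form fields are represented.

```python
from dataclasses import dataclass
from datetime import date, time
from typing import List, Optional


@dataclass
class InitialFailureReport:
    """Hypothetical subset of the Initial Failure Report fields."""
    report_number: str
    failure_date: date
    failure_time: time
    description: str


@dataclass
class LogSheetEntry:
    """Hypothetical subset of the Failure Log-Sheet fields."""
    number: int
    entry_date: date
    entry_time: time
    operator: str
    description: str
    reported: bool
    initial_report_number: Optional[str] = None


def inconsistencies(entry: LogSheetEntry, report: InitialFailureReport) -> List[str]:
    """Return the names of fields where the log entry disagrees with the report."""
    problems = []
    if entry.initial_report_number != report.report_number:
        problems.append("initial_report_number")
    if entry.entry_date != report.failure_date:
        problems.append("date")
    if entry.entry_time != report.failure_time:
        problems.append("time")
    return problems
```

An empty list from `inconsistencies` means the log entry and the report agree on the fields checked; free-text descriptions are deliberately not compared, since the Log-Sheet holds only a brief version.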
5. Reliability Growth Testing 132

It is almost certain that prototypes or new designs will not initially meet their
reliability goals. Implementation of a reliability enhancement methodology such as
FRACAS is the only way to overcome the initial problems that may surface in the first
prototype performance tests and later. Therefore, failures are identified, and actions taken
to correct them. As the procedure continues, corrective actions become less frequent.
After a reasonable amount of time, one must check whether reliability has improved, and
estimate how much additional testing is needed.

Duane observed that there is a relationship between the total operating time ($T$)
accumulated on a prototype or new design and the number of failures ($n(T)$) since the
beginning of operation. 133 If we plot the cumulative failure rate $n(T)/T$ (the reciprocal
of the cumulative mean time between failures, $\mathrm{MTBF}_c$) versus $T$ on a log-log
scale, the observed data tend to be linear regardless of the type of equipment under
consideration.

Duane's plots provide a rough estimate of the increase in the time between
failures. It is expected that the time between failures at the early stages of development
will be short, but soon after the first corrective actions it will gradually become longer.
As a consequence, Duane's plots will show a rapid reliability improvement in the early stages
of development. After the first corrective actions the reliability improvement will be
less rapid. After a corrective action we can see whether there is a reliability improvement
or not. So we have a measure of effectiveness of our corrective actions, which
corresponds to the growth of reliability.


132 The material for this section is taken (in some places verbatim) from: Lewis, E. E., Introduction to
Reliability Engineering, Second Edition, John Wiley & Sons, 1996, pages 211-212.

133 Duane, J. J., “Learning Curve Approach to Reliability Modeling,” Institute of Electrical and
Electronics Engineers Transactions on Aerospace and Electronic Systems (IEEE. Trans. Aerospace) 2563,
1964.
Figure 34. Duane's Data Plotted on a Log-log Scale.

Figure 34 illustrates a Duane plot for a hypothetical system. Because the data fall on a
straight line, we get $\ln[n(T)/T] = \alpha \ln(T) + b$, and then

$$\frac{n(T)}{T} = e^{\alpha \ln(T) + b} = e^{b}\, T^{\alpha},$$

and so finally we have $n(T) = K \cdot T^{1+\alpha}$, where $K = e^{b}$. Alpha ($\alpha$) is
the growth rate, or the change in MTBF per time interval over which the change occurred,
and $K$ is a constant related to the initial MTBF.

a. If $\alpha = 0$, there is no improvement in reliability because the straight line is
parallel to the cumulative operating hours axis, which means that there is no change in
the cumulative failure rate.

b. If $\alpha < 0$, then the cumulative failure rate decreases, and the expected
failures become less frequent as $T$ increases. Therefore reliability increases.
c. If $\alpha = -1$, then $n(T) = K = e^{b} = \text{constant}$. Therefore the number of
failures is independent of time $T$. We can assume that $\alpha = -1$ is the theoretical
upper limit for reliability growth.

d. If $\alpha > 0$, then the cumulative failure rate increases, and the expected
failures become more frequent as $T$ increases. Therefore reliability decreases.

From $n(T) = K \cdot T^{1+\alpha}$ we have $n(T)/T = K \cdot T^{\alpha}$, which is the
reciprocal of the cumulative MTBF. And so the testing time required to achieve a given
cumulative MTBF (with $\alpha < 0$) is

$$T = (K \cdot \mathrm{MTBF})^{-1/\alpha}.$$

6. Reliability Growth Testing Implementation

In order to implement the above-mentioned methodology, we may consider the
system either as a single entity or as a set of entities. In the first case, we just count all
system failures and the operational hours related to each failure. In the second case, we
may consider that the system is composed of:

a. Propulsion and power

b. Flight control and navigation

c. Communication and sensors

d. GCS (Human in the loop)

e. Miscellaneous.

Each failure can be assigned to one of the above categories, and therefore we have
to keep track of five different reliability tendencies.
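Keeping five separate reliability tendencies amounts to building one cumulative failure series per category. The following is a minimal sketch under stated assumptions: the record layout (cumulative system operating hours at failure, category name), the category strings, and the function name are hypothetical, and the example failures are invented for illustration.

```python
CATEGORIES = (
    "propulsion and power",
    "flight control and navigation",
    "communication and sensors",
    "GCS",
    "miscellaneous",
)


def series_by_category(failures):
    """Group failures into per-category (T, n(T)) series.

    failures -- iterable of (cumulative_operating_hours, category) pairs
    Unknown categories are counted under "miscellaneous".
    """
    series = {category: [] for category in CATEGORIES}
    for hours, category in sorted(failures):
        key = category if category in series else "miscellaneous"
        # n(T) is simply the running count of failures in this category.
        series[key].append((hours, len(series[key]) + 1))
    return series


# Invented example records, not real test data.
example = [
    (12.0, "propulsion and power"),
    (5.0, "GCS"),
    (30.0, "propulsion and power"),
    (40.0, "airframe"),
]
tracks = series_by_category(example)
```

Each resulting series gives the $(T, n(T))$ points needed to draw one Duane plot per category, alongside the plot for the system as a whole.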

B. RELIABILITY IMPROVEMENT PROCESS
1. UAV Considerations

In applying a reliability improvement process to SUAVs, we can consider the
following:

a. There is no officially accepted future system concept of operations for
SUAVs.
b. There are many classified and unclassified reports published on many
different types of SUAVs.

c. Many systems have been tested, and there are plans for future tests in
battlefield environments and in deployments with the fleet:

(1) The EWASP SUAV system.134

(2) The XPV-IB TERN UAV system.135

(3) The Sea ALL (Sea Airborne Lead Line) SUAV system, which is
a variant of the USMC Dragon Eye UAV.136

d. There is a real operational need for SUAVs during deployments of the
fleet. For example, due to an urgent requirement to maintain a continuous recognized
maritime picture of the Carrier Strike Group vital area, small UAVs are needed to assist
the limited existing maritime patrol aircraft. For that reason, a request for the SUAV
Archangel to be used onboard the USS Enterprise CSG has been released.137

e. There is a real problem regarding the reliability of those systems. UAVs
in general have roughly up to 100 times the failure rate of manned aircraft, and SUAVs
are even more failure prone than larger ones. The US Office of the Secretary of Defense's
UAV Roadmap, which was released in May 2003, recommends that more research be
conducted into low Reynolds-number flight regimes and that investigations be carried out
to enhance UAV reliability and therefore availability. It also recommends the
incorporation and development of all-weather practices into UAV designs.138


134 Morris Jefferson, Aerospace Daily, December 8, 2003, “Navy To Use Wasp Micro Air Vehicle To
Conduct Littoral Surveillance.”

135 Message from COMNAVAIRSYSCOM to HQ USSOCOM MACDILL AFB FL, March 26,
2004, “UAV Interim Flight Clearance for XPV-IB TERN UAV System, Land Based Concept of Operation
Flights.”

136 Sullivan Carol, Kellogg James, Peddicord Eric, Naval Research Lab, January 2002, Draft of
“Initial Sea All Shipboard Experimentation.”

137 Undated message from Commander, Cruiser Destroyer Group 12 to Commander, Second Fleet,
“Urgent Requirement for UAVs in Support of Enterprise Battle Group Recognized Maritime Picture.”

138 UAV Rolling News, “UAV Roadmap defines reliability objectives,” March 18, 2003, Internet,
February 2004. Available at: http://www.uavworld.com/_disel/0000002
2. UAVs and Reliability

The U.S. military UAV fleet (consisting of Pioneers, Hunters, and Predators)
reached 100,000 cumulative flight hours in 2002. This milestone is a good point at which
to assess the reliability of these UAVs. Reliability is an important measure of
effectiveness for achieving routine airspace access, reducing acquisition system cost, and
improving UAV mission effectiveness. UAV reliability is important because it supports
their affordability, availability, and acceptance.139

UAV  reliability  is  closely  tied  to  their  affordability  primarily  because  UAVs  are 
expected  to  be  less  expensive  than  manned  aircraft  with  similar  capabilities.  Savings  are 
based  on  the  smaller  size  of  the  UAVs  and  the  omission  of  pilot  or  aircrew  systems. 

a. Pilot Not on Board 140

With the removal of the pilot and the tendency to produce a cheaper UAV,
redundancy was minimized and component quality was degraded. As a result, UAVs became
more prone to in-flight loss and more dependent on maintenance, and their
reliability and mission availability decreased significantly. Being unmanned, they
cannot provide flight cues to the user such as:

•  Acceleration  sensation, 

•  Vibration  response, 

•  Buffet  response, 

•  Control  stick  force  feedback, 

• Any higher longitudinal, directional, and lateral control sensitivities,

• Direct feeling of the failure, in general.

Ground testing and instrumentation data analysis are the only source for
such cues.

b. Weather Considerations 141

139  OSD  2002,  Appendix  J,  page  186. 

140 The material for this section is taken from: Williams Warren, Michael Harris, “The Challenges of
Flight-Testing Unmanned Air Vehicles,” Systems Engineering, Test & Evaluation Conference, Sydney,
Australia, October 2002.
Experience has shown that the most important operational consideration
for flight is the weather, regardless of other technical characteristics such as engine type,
power, or wingspan. Meteorological conditions affect both the platform and the GCS.
Factors affecting the platform include winds, turbulence, cold temperatures at designated
altitudes, icing, rain, fog, low cloudiness, humidity in general, and lightning strikes.
Meteorological conditions that affect the GCS include extreme ambient temperatures, icing,
rain, fog, low cloudiness, humidity, and lightning strikes. For the GCS, these effects can be
mitigated because ground units are subject to more relaxed design constraints than small
aerial units.

For  the  platform  the  most  important  weather  condition  is  wind  speed  and 
direction  at  surface  (the  lowest  100  meters  of  the  atmosphere)  and  upper  levels.  Other 
weather  conditions  are  important  but  do  not  affect  the  flight  unless  they  are  extreme. 
Surface  winds  affect  air-platforms  during  takeoff  and  landings,  but  also  during  preflight 
and post-flight ground handling. Light winds are most favorable for routine operation and
testing.  High  winds  during  flight  can  cause  significant  platform  drift,  which  results  in 
poor  platform  position  controllability.  This  can  render  a  mission  profile  infeasible  and 
result  in  flight  cancellation. 

Prior  to  deploying  any  UAV  system,  a  study  must  be  made  of  the 
prevailing  meteorological  conditions.  If  conditions  are  extreme  (such  as  very  high  winds, 
extreme cold, or high altitude), then the UAV system may not be mission capable, and a
different asset may be better suited. Alternate UAVs or manned systems should be
considered  in  this  case. 


141 Teets, Edward H., Casey J. Donohue, Ken Underwood, and Jeffrey E. Bauer, National Aeronautics
and Space Administration (NASA), NASA/TM-1998-206541, “Atmospheric Considerations for UAV
Flight Test Planning,” January 1998, Internet, February 2004. Available at:
http://www.dfrc.nasa.gov/DTRS/1998/PDF/H-2220.pdf
c. Gusts and Turbulence

The high susceptibility of the platform to gusts and turbulence makes
stabilizing flight operation points very difficult. The platform's low wing loading, which
can lead to high power loading in gusts and turbulence, and its low inertia are the main
reasons for that behavior.142

During the development test and evaluation (DT&E) period, an SUAV can
be tested in wind tunnels to establish its general flight characteristics. A
basic flight manual can be produced during DT&E that will be tested and refined during
the operational test and evaluation (OT&E) period. The advantage of SUAVs is that the
actual airframe can be tested in the wind tunnel, without any scaling analogy or other factor
involved in the calculations, because the original platform (and not a miniaturized
model) is being tested.

d.  Non  Developmental  Items  (NDI)  or  Commercial  Off-the-shelf 
(COTS) 

One of the factors in lack of reliability of inexpensive UAVs is the
use of NDI/COTS components that were never meant for an aviation
environment. In many cases, it would have been better to buy the more
expensive aviation-grade components to begin with than to retrofit the
system once constructed. Do not assume COTS components/systems will
work for an application they were not designed for. In other words, they
have to be COTS for that specific use.143

Using NDI/COTS items may save money but requires testing in order to
ensure compatibility and to reduce uncertainty in mission efficiency.144

e. Cost Considerations 145

By  using  COTS  technology,  distributed  sensors,  communications 
and  navigation,  it  is  also  proposed  that  the  total  system  reliability  may  be 
increased.  It  must  be  noted  however  that  this  approach  does  not  currently 
account  for  issues  of  airworthiness  certification. 


142  NASA  1998. 

143  Clough. 

144 Hoivik, Thomas H., OA-4603 Test and Evaluation Lecture Notes, Version 5.5, “The Role of Test
and Evaluation,” presented at NPS, winter quarter 2004.

145 The material for this section is taken (in some places verbatim) from: Munro Cameron, and Petter
Krus, AIAA's 1st Technical Conference & Workshop on Unmanned Aerospace Vehicles, Systems,
Technologies and Operations; a Collection of Technical Papers, AIAA 2002-3451, “A Design Approach for
Low cost ‘Expendable’ UAV system,” undated.
It is a fact that the primary cost item in UAVs is not the vehicles but the
guidance,  navigation,  control  and  sensor  packages  that  they  carry.  Typically  all  those 
technology  “miracles”  can  represent  70%  of  the  system’s  cost.  Although  sensors  continue 
to  decrease  in  cost,  size  and  power  consumption,  the  demands  for  more  capabilities  and 
mission  types  are  increasing.  As  a  result,  cost  is  increasing. 

We  can  assume  that  acquisition  cost  is  proportional  to  reliability,  and  wear 
out  is  not  proportional  to  reliability.  Then,  a  generic  reliability  trade-off  can  be  seen  as  in 
Figure  35.  We  can  conclude  that  a  highly  reliable  UAV  does  not  coincide  with  an  overall 
low  system  cost. 

Another  point  of  interest  related  to  cost  and  reliability  is  that  reliability  is 
low  for  SUAVs  because  SUAVs  are  designed  to  be  inexpensive.  This  statement  is  true 
because reliability is expensive and one truly gets what one pays for.146


146  Clough. 
Figure 35. Generic Cost Relationship. (After Munro)

f. Man in the Loop

The man-in-the-loop can be accomplished “through nearly all of the
potential controlling equipment available.” UAV control equipment, together with the data
display mechanisms, is the link between man and machine. Control can be
remotely piloted, semi-autonomous with a combination of programmed and remotely
piloted flight, or fully autonomous (full-auto) with pre-flight and/or in-flight programming.147

Another  point  of  interface  between  man  and  machine  is  maintenance  and 
pre-flight  and  after-flight  servicing.  Piloting  a  UAV  resembles  an  instrumented  manned 
flight.  For  that  reason  there  are  four  main  considerations: 

147 Carmichael, Bruce W., and others, “Strikestar 2025,” Chapter 4, “Developmental Considerations,
Man-in-the-Loop,” August 1996, Department of Defense, Internet, February 2004. Available at:
http://www.au.af.mil/au/2025/volume3/chap13/v3c13-4.htm
(1) Collision avoidance

(2) Multiple platforms control

(3) Landing (recovery)

(4) Loss of flight control and regaining it.

g. Collision Avoidance

For UAVs, a system is needed that can weigh tasks and put priorities only
on the flight requirements or mission requirements. It is essential to have the capability,
as a pilot does, to sense and avoid obstacles that most of the time the remote pilot
cannot see.148

If an accurate collision avoidance system were developed, UAVs could
become more responsive to the demanding needs of the battle commander.149 “NASA,
the U.S. military, and the aerospace community have joined forces to develop detect, see,
and avoid (DSA) technologies for UAVs.”150 These technologies will also increase the
safety of operations above residential areas and allow UAVs to join piloted aerial vehicles
in national airspace.

h.  Landing 

Many UAV mishaps are related to landing. The usual ways for UAVs to
land are:

• Using landing gear on runways or airstrips

• Using landing gear and arresting gear on ship flight-decks

• Making a calculated crash landing without using landing gear

•  Recovering  in  an  arresting  net 

148 Finley, Barfield, Automated Air Collision Avoidance Program, Air Force Research Laboratory,
AFRL/VACC, WPAFB, “Autonomous Collision Avoidance: the Technical Requirements,”
0-7803-6262-4/00/$10.00 (c) 2000 IEEE.

149 Coker, David, Kuhlmann, Geoffrey, “Tactical-Unmanned Aerial Vehicle ‘Shadow 200’
(T UAV),” Internet, February 2004. Available at:
http://www.isye.gatech.edu/~tg/cources/6219/assign/fall2002/TUAVRedesign/

150 Lopez, Ramon, American Institute of Aeronautics and Astronautics (AIAA), “Avoiding Collisions
in the Age of UAVs,” Aerospace America, June 2002, Internet, February 2004. Available at:
http://www.aiaa.org/aerospace/Article.cfm?issuetocid=223&ArchiveIssueID=27




•  Landing  in  sea  water 

• Using a parachute

• Vertical take-off-and-landing (VTOL)

The  most  common  problems  with  recovery  are  lack  of  experience  by  the 
remote  pilot  and  low  altitude  winds,  even  for  the  VTOL  UAVs.  To  resolve  or  mitigate 
this  problem,  automated  recovery  systems  can  be  used.  Those  systems  have  been 
developed  to  improve  precision,  ease  and  safety  of  UAV  recoveries,  on  land  and  sea,  and 
in  a  variety  of  weather  conditions.  151 

i. Losing and Regaining Flight Control

Uninterrupted communication between the operator in the
GCS and the platform is a critical capability.152 An interruption of that link is always
possible due to loss of line-of-sight (LOS), communication failure related to the platform or
GCS, and electromagnetic interference (EMI). The only way to overcome this problem is
autonomy with dependable autopilot and mission control software.153

Autonomy for a UAV platform is based on an onboard computer, which is
responsible for most of the platform's performance and “behavior”. Subprograms for
time-related loss of communications, regaining communications, points at which to attempt
to regain communications, and other functions related to mission effectiveness are very
common in UAV software. Additionally, emission control applications help allocate
bandwidth for different uses and may decrease the EMI hazard. For UAVs that use
different sensor configurations in the same type of platform, there is also a need for
reconfigurable multi-mission processing.154

j.  Multiple  Platforms  Control 


151 UAV Annual Report FY 1997: Subsystems, Key subsystem program, “UAV common recovery
system (UCARS),” Internet, February 2004. Available at:
http://www.fas.org/irp/agency/daro/uav97/page36.html

152 Coker.

153 Puseov, Johan, “Flight System Implementation,” Sommaren-Hosten 2002, Royal Institute of
Technology (KTH), Internet, February 2004. Available at:
http://www.particle.kth.se/group_docs/admin/2002/Johan_2t.pdf

154 Robinson, John, Technical Specialist Mercury Computers, COTS Journal, “UAV Multi-Mission
Payloads Demand a Flexible Common Processor,” June 2003, Internet, February 2004. Available at:
http://www.mc.com/literature/literature_files/COTSJ_UAVs_6-03.pdf
Piloting a UAV generally requires two operators. The aviator
operator (AVO) is responsible for aviating and navigating, and the mission payload
operator (MPO), or Sensor Operator (SENSO), is responsible for target search and
monitoring of system parameters. In smaller UAVs there may be only one operator who
does both tasks. Requiring two operators limits the number of operators available for
other missions. Is it possible for those two operators to control two or more platforms
simultaneously?155 Is it also possible for the single operator of a smaller UAV to do
the same?

The SUAV operators are part of a battle team, and their primary skill and
training is to fight and then to operate the SUAVs. They operate SUAVs from a distance
yet in the proximity of the battlefield. So, care must be taken not to place excessive
workload demands on the SUAV operators. Instead, by making the platform control and
operation more user-friendly, we can optimize the benefits of SUAV capabilities. When
the operators can stand far enough from the battlefield, user-friendly control of SUAVs is
advantageous, and multiple platform control can become a more realistic capability if
SUAV autonomy is high.

k. Reliability, Availability, Maintainability of UAVs

Reliability is the probability that a UAV system or component will operate
without failure for a specified time (the mission duration plus the duration of the preflight
tests). This probability is related to the mean time between failures (MTBF) and
availability.

Availability is defined as the ability of a system to be ready for use when
needed at an unknown (random) time. It is the natural interpretation of reliability in
everyday life. Availability is a function of reliability and maintainability.
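As an illustration of how these quantities are commonly related, the standard textbook expressions are sketched below. They assume a constant failure rate, which is an assumption made here for illustration and not a result of this thesis; $t$ (mission plus preflight test time) and MTTR (mean time to repair) are symbols introduced only for this sketch.

```latex
R(t) = e^{-t/\mathrm{MTBF}},
\qquad
A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
```

Under these assumptions, reliability improves as MTBF grows relative to the mission time, and availability improves either by raising MTBF (reliability) or by lowering MTTR (maintainability).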

As discussed earlier, redundancy plays an important role in keeping reliability
high. Keeping redundancy at a high level increases system complexity and cost, however.


155 Dixon, Stephen R., and Christopher D. Wickens, “Control of multiple UAVs: A Workload
Analysis,” University of Illinois, Aviation Human Factors Division, Presented to 12th International
Symposium on Aviation Psychology, Dayton, Ohio 2003.
Volume, weight, and cost are also important for a UAV system's operational usage and
real system needs. There is a trade-off, as indicated in Figure 36.156

Figure 36. Reliability Trade-Offs. (After Sakamoto, slide 8)

Where redundancy is difficult to implement, fault avoidance or parts
quality is the solution to improve reliability. In some cases adding redundancy in critical
subsystems, like navigation aids, is unavoidable. Thus, cost and complexity increase.157
Maintainability is a system effectiveness concept that measures the ease and rapidity with
which a system or equipment is restored to its operational state after failing. Reliability,
availability, and maintainability are discussed in Appendix D.

3. Reliability Improvement for Hunter


156  Sakamoto,  Norm,  presentation:  “UAVs,  Past  Present  and  Future,”  Naval  Postgraduate  School, 
February  26,  2004. 

157  Clough. 
The Army’s acquisition of the Hunter RQ-5 system is an example of reliability improvement after the implementation of a reliability improvement program. In 1995, during acceptance testing, three Hunter platforms crashed within a three-week period. As a result, full-rate production was canceled. The Program Management Office and the prime contractor, Thompson Ramo Wooldridge (TRW), performed a Failure Mode, Effects, and Criticality Analysis (FMECA) for the whole system. Failures were identified, and design changes were made after failure analyses and corrective actions were implemented. As a result, Hunter’s Mean Time Between Failures (MTBF) for its servo actuators, which were the main cause of many crashes, increased from 7,800 hours to 57,300 hours.

Hunter returned to flight status three months after its last crash. Over the next two years, the system’s MTBF doubled from four to eight hours and today stands close to 20 hours. Prior to 1995, Hunter’s mishap rate was 255 per 100,000 hours; afterwards (1996-2001) the rate was 16 per 100,000 hours. Initially canceled because of its reliability problems, Hunter has become the standard to which other UAVs are compared in reliability. 158

4. Measures of Performance (MOP) for SUAVs

In  manned  aviation,  the  usual  Measures  Of  Performance  (MOPs)  used  for 
reliability  tracking  are 

•  Accidents  per  100,000  hours  of  flight 

•  Accidents  per  1,000,000  miles  flown 

• Accidents per 100,000 departures 159

In  the  Vietnam  War,  the  MOPs  used  for  the  Lightning  Bug  were 

• The percent of platforms returned from a mission, calculated as the number of platforms recovered from similar successful missions divided by the number of platforms launched for that mission in a certain time period.

• Missions accomplished per platform per mission type in a certain time period. 160


158 OSD 2002, Appendix J.

159 National Transportation Safety Board (NTSB), Aviation Accident Statistics, “Table 6. Accidents, Fatalities, and Rates, 1984 through 2003, for U.S. Air Carriers Operating Under 14 CFR 121, Scheduled Service (Airline),” Internet, April 2004. Available at: http://www.ntsb.gov/aviation/Table6.htm

The frequency of mishaps is the primary factor for choosing a MOP. In the SUAV case, we can use the following MOPs for reliability tracking:

a. Crash Rate (CR): The total number of crashes divided by the total number of flight hours. A crash results in loss of platform.

b. Operational CR: The total number of crashes divided by the total number of operating flight hours.

c. Mishap Rate (MR): The total number of mishaps divided by the total number of flight hours. This thesis defines a mishap for a SUAV as significant platform damage or a total platform loss. A mishap requires repair effort less than or equal to that of a crash, depending on the condition of the platform after the mishap.

d. Operational MR: The total number of mishaps divided by the total number of operating flight hours.

e. Current Crash Rate (CCR): The total number of crashes since the last system modification divided by the total number of flight hours since the last system modification.

f. Operational CCR: The total number of crashes since the last system modification divided by the total number of operating flight hours since the last modification.

g. Current Mishap Rate (CMR): The total number of mishaps since the last system modification divided by the total number of flight hours since the last modification.


160 Carmichael, Bruce W., Col (Sel), and others, “Strikestar 2025,” Appendix A, B & C, “Unmanned Aerial Vehicle Reliability,” Appendix A, Table 4, August 1996, Department of Defense, Internet, February 2004. Available at: http://www.au.af.mil/au/2025/volume3/chap13/v3c13-8.htm
h. Operational CMR: The total number of mishaps since the last system modification divided by the total number of operating flight hours since the last modification.

i. Crash Rate “X” (CRX): The crash rate for the last “X” operational flight hours, as in “CR50,” which is the CR for the last 50 flight hours.

j. Mishap Rate “X” (MRX): The MR for the last “X” operational flight hours, as in “MR50,” which is the MR for the last 50 flight hours.

k. Achieved Availability (AA): The total operating time (OT) divided by the sum of OT, the total corrective maintenance time, and the total preventive maintenance time.

l. Percent Sorties Loss: The total number of sorties lost (for any reason) divided by the total number of sorties assigned.

m. Percent Sorties Mishap: The total number of sorties with a mishap divided by the total number of sorties assigned.
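
The crash-based MOPs and the achieved availability defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the choice to record each crash as the cumulative flight hour at which it occurred, and all sample numbers are hypothetical and do not come from any SUAV program.

```python
def crash_rate(crash_hours, total_hours):
    """CR: total number of crashes divided by total flight hours."""
    return len(crash_hours) / total_hours


def current_crash_rate(crash_hours, total_hours, last_mod_hour):
    """CCR: crashes and flight hours counted since the last modification."""
    recent = [h for h in crash_hours if h > last_mod_hour]
    return len(recent) / (total_hours - last_mod_hour)


def crash_rate_x(crash_hours, total_hours, x):
    """CRX: crash rate over the last x flight hours (x=50 gives CR50)."""
    window = min(x, total_hours)  # cannot look back past the first flight
    recent = [h for h in crash_hours if h > total_hours - window]
    return len(recent) / window


def achieved_availability(operating, corrective, preventive):
    """AA: operating time over operating plus maintenance time."""
    return operating / (operating + corrective + preventive)


# Hypothetical log: cumulative flight hour at which each crash occurred.
crash_hours = [12.0, 40.0, 95.0, 180.0]
total_hours = 200.0

cr = crash_rate(crash_hours, total_hours)
ccr = current_crash_rate(crash_hours, total_hours, last_mod_hour=100.0)
cr50 = crash_rate_x(crash_hours, total_hours, x=50.0)
aa = achieved_availability(operating=200.0, corrective=30.0, preventive=20.0)
print(cr, ccr, cr50, aa)
```

With only flight hours and crash events recorded, these four numbers are all that can be computed, which is consistent with the minimal data typically available for SUAVs.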

SUAVs are generally low-cost systems with prices from $15K to $300K. For that reason there is no official data collecting system in effect detailed enough to provide reliability data. Usually, only the number of flight hours and the number of crashes are known. For that reason, the most suitable reliability MOPs currently are CR, CCR, and CRX.

5.  Reliability  Improvement  Program  on  SUAVs 

A reliability improvement program seeks to achieve reliability goals by improving product design. The objective of an improvement program is to identify, locate, and correct faulty and weak aspects of the design, manufacturing process, and operating procedures. For the SUAV, we first applied existing techniques for improving system reliability.

We started with FMEA, which is the basis for the most common methodologies for improving reliability, and also discussed FMECA and FTA. After that, reliability centered maintenance, specifically MSG-3, was presented as the prevailing reliability enhancement and maintenance preservation methodology in civil aviation. We showed that MSG-3 is not suitable for UAV applications because of its dependence on an onboard operator. We highlighted the need for a data collection system and presented FRACAS. FRACAS is best suited for SUAVs, especially during their initial phases of development or operational test development. Finally, a method or technique is needed to keep track of reliability growth. Duane plots were presented and recommended for their simplicity.

6. Steps for Improving Reliability on SUAVs

We can consider a FRACAS system as a part of a generic reliability improvement program. The first step of such a program is environmental stress screening (ESS). ESS is a process that uses random vibration, within certain operational limits, and temperature cycling to accelerate the appearance of part and workmanship imperfections. Infant mortality failures can thus be identified relatively easily and in a short time.

In addition to ESS, the following actions should be taken:

a. Verify/calibrate the instruments for the field tests or field operations. With calibrated instruments we can substantially reduce instrumentation errors. A rule of thumb is to use another instrument that is at least 10 times more accurate than the instrument we want to calibrate. 161

b. Set the initial weather restrictions for UAV flights.

c. Conduct an FMEA of the system and/or perform an FTA if it is necessary to focus on a certain failure. For that purpose, we have tailored a form as in Table 18. 162


161  Hoivik. 

162  Department  of  Defense,  MIL-STD-1629A,  “Procedures  For  Performing  a  Failure  Mode  Effects 
and  Criticality  Analysis,”  Task  101  FMEA  sheet,  November  24,  1980. 
UAVs FMEA Form

Header fields:
System Name                        FMEA Date
Part Name                          Page ___ of ___ Pages
Reference Drawing                  Prepared by
Mission                            Approved by
Revisited by/Revision Date

Columns:
ID Number
Item/Functional ID
Design Function
Failure Modes and Causes
Operational Phase
Failure Effects: Local
Failure Effects: Next Higher Level
Failure Effects: End Effects
Failure Detection Method
Fault Acceptance
Severity Classification
Remarks

Table 18. UAVs FMEA Form (After MIL-STD-1629A, Figure 101.3)
The cell definitions are: 163

(1) ID Number, given to each entry on the FMEA form for record-keeping purposes.

(2) Item/Functional Identification, for the item or the functional block or subsystem, such as the carburetor or the fuel tank, for example.

(3) Design Function, a brief statement about the item’s design function. State that the carburetor mixes fuel and air in order to feed the engine with the proper fuel-air density, for example.

(4) Failure Modes and Causes, a brief statement about the way(s) in which the item may fail. In the case of the carburetor, the failure modes are improper adjustment, plugged needle valve, jammed leverage, servo failure, excess vibrations, throttle failure, insufficient fastening to the frame, etc.

(5) Operational Phase, a brief statement about the item’s objective or task; in the case of the carburetor, it controls engine running speed.

(6) Local Failure Effects, explaining the immediate consequences of the item’s identified failure mode. In the case of the carburetor, we can state “Engine cannot be controlled.”

(7) Next Higher Level, about the effect of the local failure on the next higher functional system level; in the case of the carburetor, we can state “Loss of engine.”

(8) End Effects, explaining the effects of the indicated failure mode on the whole system. In the case of the carburetor, we can state “Loss of thrust.”

(9) Failure Detection Method, explaining the way(s) by which a failure can be detected. In the case of the carburetor, it could be detected by the operator or by the control system itself.


163 The material in the following part of this section is taken (in some places verbatim) from: RAC FMECA, pages 60-66.
(10) Fault Acceptance, a statement of the ways that the system can overcome or bypass the effects of failure. In the case of the carburetor the system design does not provide any alternatives, so the word “None” can be placed under fault acceptance.

(11) Severity Classification, representing the degree of damage that will be caused by the occurrence of the failure mode. It could be any of the following categories:

(a) Classification I, for complete loss of system

(b) Classification II, for degraded operation of the system

(c) Classification III, for a failure status that still needs to be investigated

(d) Classification IV, for no effect on system functions.

The failure effect for the carburetor can be classified as a Category I severity.

(12) Remarks, relating details about the evaluation of the given failure mode.
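
For record keeping, the twelve cells defined above map naturally onto a small data structure. The sketch below is only an illustration: the class and field names are assumptions chosen to mirror the form, the remark text is invented for the example, and the carburetor entry simply restates the example values given in the cell definitions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Severity(Enum):
    """Severity classifications as listed under cell (11)."""
    I = "complete loss of system"
    II = "degraded operation of the system"
    III = "failure status that still needs to be investigated"
    IV = "no effect on system functions"


@dataclass
class FmeaEntry:
    """One row of the FMEA form; field names are illustrative only."""
    id_number: int
    item: str
    design_function: str
    failure_modes: List[str]
    operational_phase: str
    local_effect: str
    next_higher_level_effect: str
    end_effect: str
    detection_method: str
    fault_acceptance: str
    severity: Severity
    remarks: str = ""


def complete_loss_entries(entries):
    """Return the rows classified as Severity I (complete loss of system)."""
    return [e for e in entries if e.severity is Severity.I]


carburetor = FmeaEntry(
    id_number=1,
    item="Carburetor",
    design_function="Mixes fuel and air to feed the engine",
    failure_modes=["improper adjustment", "plugged needle valve",
                   "jammed leverage", "servo failure"],
    operational_phase="Controls engine running speed",
    local_effect="Engine cannot be controlled",
    next_higher_level_effect="Loss of engine",
    end_effect="Loss of thrust",
    detection_method="Operator or control system",
    fault_acceptance="None",
    severity=Severity.I,
    remarks="Hypothetical remark, for illustration only",
)

print([e.item for e in complete_loss_entries([carburetor])])
```

Keeping the severity as an enumerated type rather than free text makes it straightforward to filter or count entries by classification once many rows have been collected.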

d. Establish a FRACAS. Implementation of FRACAS should continue through the system’s life cycle, even for the ESS tests, for all failures occurring during developmental and operational testing.

For a reliability improvement program, FRACAS is the most critical facet. 164 Failures must be identified and isolated to the root failure mode. After the failure analysis is complete, corrective actions are identified, documentation is completed, and data is entered into FRACAS. The system’s manufacturer can use the information in FRACAS to incorporate the corrective actions into the product. We can use the same FRACAS forms we presented in the previous subsection.

e. Track reliability improvement by using Duane’s theory, MTBFs, and/or achieved availability of the system.


164  Pecht,  page  323. 
f. Complete a reliability improvement plan. This plan must be completed, approved, and coordinated by the manufacturer’s engineers and reliability manager in cooperation with the military personnel who operate the systems. The following need to be addressed in the plan:

•  Resources, 

•  Test  schedule  and  test  equipment, 

•  Personnel, 

•  Test  environment, 

•  Procedures, 

•  Data  base  establishment,  and 

•  Corrective  action  implementation  program. 

Figure  37  outlines  the  reliability  improving  process  for  SUAVs. 


[Flow diagram: a. Verify/Calibrate Instruments; b. Set Initial Weather Restrictions; c. Conduct FMEA; d. Establish FRACAS; e. Track Reliability; f. Complete Reliability Improvement Plan]

Figure 37. Reliability Improving Process on SUAVs
IV. EXAMPLE


A. RQ-2 PIONEER 86 THROUGH 95

From the US Navy’s Airborne Reconnaissance Office, 15 March 1996, come the following data regarding the RQ-2 Pioneer battlefield UAV mishaps from 1986 until 1995. 165


Year   Mishaps   Flight hours
86     5         96.3
87     9         447.1
88     24        1050.9
89     21        1310.5
90     21        1407.9
91     28        2156.6
92     20        1179.3
93     8         1275.6
94     16        1568
95     16        1752

[Chart: Operating hours vs time; vertical axis 0 to 14,000 hours, horizontal axis years 84 to 96]

Table 19. RQ-2 Pioneer data

As discussed in Chapter 3, we can calculate only the Mishap Rate (MR) and the Current Mishap Rate (CMR) because we have data only for mishaps and total flight hours. Assuming that the system is modified each year, we calculate the following:


Year   Mishap Rate (MR)   Current Mishap Rate (CMR)
86     0.051921           0.05192108
87     0.025764           0.020129725
88     0.023835           0.022837568
89     0.020311           0.016024418
90     0.01855            0.014915832
91     0.016694           0.0129834
92     0.016735           0.016959213
93     0.015239           0.006271558
94     0.014487           0.010204082
95     0.013721           0.00913242

Table 20. MR and CMR
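
The two columns of Table 20 can be reproduced directly from the yearly data in Table 19. The sketch below assumes, as the text does, that the system is modified every year, so that MR uses cumulative totals while CMR uses only the figures for the current year. The function and variable names are illustrative.

```python
YEARS = [86, 87, 88, 89, 90, 91, 92, 93, 94, 95]
MISHAPS = [5, 9, 24, 21, 21, 28, 20, 8, 16, 16]
FLIGHT_HOURS = [96.3, 447.1, 1050.9, 1310.5, 1407.9,
                2156.6, 1179.3, 1275.6, 1568.0, 1752.0]


def mishap_rates(mishaps, hours):
    """Return (MR, CMR) lists, one value per year.

    MR  = cumulative mishaps / cumulative flight hours.
    CMR = mishaps / flight hours of the current year only, which
          assumes a system modification at the start of every year.
    """
    mr, cmr = [], []
    cum_mishaps, cum_hours = 0, 0.0
    for m, h in zip(mishaps, hours):
        cum_mishaps += m
        cum_hours += h
        mr.append(cum_mishaps / cum_hours)
        cmr.append(m / h)
    return mr, cmr


mr, cmr = mishap_rates(MISHAPS, FLIGHT_HOURS)
for year, a, b in zip(YEARS, mr, cmr):
    print(f"{year}  MR={a:.6f}  CMR={b:.6f}")
```

Multiplying either rate by 100,000 gives mishaps per 100,000 flight hours, the unit used earlier for the Hunter comparison.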


165 Carmichael, Bruce W., Col (Sel), and others, “Strikestar 2025,” Appendix A, B & C, “Unmanned Aerial Vehicle Reliability,” August 1996, Department of Defense school, Internet, February 2004. Available at: http://www.au.af.mil/au/2025/volume3/chap13/v3c13-8.htm
Both MOPs indicate rapid improvement during the first two years followed by a much slower rate of improvement.


1. We follow Duane’s theory and analyze the data as seen in Table 21. We assume that reliability improvement efforts have been implemented every year on all similar systems.


N (Cum Mish)   T (Cum flight hours)   N/T        ln(T)      ln(N/T)     Regression    exp(regression)
5              96.3                   0.051921   4.567468   -2.95803    -3.0499928    0.047359265
14             543.4                  0.025764   6.297846   -3.658788   -3.4840591    0.030682613
38             1594.3                 0.023835   7.37419    -3.736604   -3.7540609    0.023422437
59             2904.8                 0.020311   7.97412    -3.896582   -3.9045536    0.020149947
80             4312.7                 0.01855    8.369319   -3.987293   -4.0036897    0.018248183
108            6469.3                 0.016694   8.774823   -4.092692   -4.1054106    0.016483249
128            7648.6                 0.016735   8.942278   -4.090248   -4.1474168    0.015805192
136            8924.2                 0.015239   9.096522   -4.183867   -4.186109     0.015205334
152            10492.2                0.014487   9.258387   -4.234507   -4.226713     0.014600302
168            12244.2                0.013721   9.412808   -4.288844   -4.2654495    0.014045553

Table 21. Duane’s Theory Data Analysis


The  results  from  the  regression  analysis  are  the  following: 


SUMMARY OUTPUT

Regression Statistics
Multiple R           0.984226284
R Square             0.968701379
Adjusted R Square    0.964789051
Standard Error       0.073881376
Observations         10

ANOVA
             df   SS            MS            F          Significance F
Regression   1    1.351526748   1.351526748   247.6023   2.65748E-07
Residual     8    0.043667661   0.005458458
Total        9    1.39519441

             Coefficients   Standard Error   t Stat        P-value
Intercept    -1.90424026    0.129763159      -14.6747372   4.57E-07
ln(T)        -0.25085068    0.015941821      -15.7353841   2.66E-07

Table 22. Regression Results
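
The regression in Table 22 is an ordinary least-squares fit of ln(N/T) against ln(T) on the cumulative data of Table 21. A minimal sketch of that fit follows, written without any statistics package. The function name and return layout are assumptions for illustration, and the spreadsheet tool used for the thesis may differ in its internals.

```python
import math

MISHAPS = [5, 9, 24, 21, 21, 28, 20, 8, 16, 16]
FLIGHT_HOURS = [96.3, 447.1, 1050.9, 1310.5, 1407.9,
                2156.6, 1179.3, 1275.6, 1568.0, 1752.0]


def duane_fit(mishaps, hours):
    """Fit ln(N/T) = b + alpha * ln(T) by least squares.

    N and T are cumulative mishaps and cumulative flight hours.
    Returns (alpha, b, r_squared).
    """
    xs, ys = [], []
    n_cum, t_cum = 0, 0.0
    for m, h in zip(mishaps, hours):
        n_cum += m
        t_cum += h
        xs.append(math.log(t_cum))
        ys.append(math.log(n_cum / t_cum))

    count = len(xs)
    x_bar = sum(xs) / count
    y_bar = sum(ys) / count
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    alpha = sxy / sxx                   # slope: growth parameter
    b = y_bar - alpha * x_bar           # intercept
    r_squared = sxy ** 2 / (sxx * syy)  # coefficient of determination
    return alpha, b, r_squared


alpha, b, r_squared = duane_fit(MISHAPS, FLIGHT_HOURS)
print(f"alpha={alpha:.4f}  b={b:.4f}  R^2={r_squared:.4f}")
```

The slope is the growth parameter: the more negative it is, the faster the cumulative mishap rate falls as flight hours accumulate.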
In that case α is -0.25 for the total 12,244.2 hours of operations. In the next figure, we can see Duane’s regression and failure rate versus time plots.

Figure 38. Duane’s Regression and Failure Rate versus Time (horizontal axis: operating hours, logarithmic scale from 1 to 10,000)

From the residual and Duane plots we see a steeper descent of the failure rate in the first years followed by a short period of constant failure rate. The decline in the failure rate in the last years is not as steep as in the first years.

2. Using the same data set, we concentrate on the last six years, from 1990 to 1995.


Year   Mishaps   Flight hours
90     21        1407.9
91     28        2156.6
92     20        1179.3
93     8         1275.6
94     16        1568
95     16        1752

[Chart: Operating hours vs time]

Table 23. RQ-2 Pioneer Data, 1990 to 1995


We follow Duane’s theory and analyze the data as seen in the next table.

N (Cum Mish)   T (Cum flight hours)   N/T        ln(T)      ln(N/T)     Regression    exp(regression)
21             1407.9                 0.014916   7.249855   -4.205332   -4.173563     0.015397302
49             3564.5                 0.013747   8.178779   -4.286959   -4.2891601    0.013716442
69             4743.8                 0.014545   8.464594   -4.230487   -4.3247274    0.013237158
77             6019.4                 0.012792   8.702743   -4.358937   -4.3543631    0.012850622
93             7587.4                 0.012257   8.934244   -4.401645   -4.3831715    0.012485697
109            9339.4                 0.011671   9.141997   -4.450649   -4.4090247    0.012167039

Table 24. Duane’s Theory Data Analysis for 1990 to 1995


Now  the  results  from  the  regression  analysis  follow: 


SUMMARY OUTPUT

Regression Statistics
Multiple R           0.864537515
R Square             0.747425115
Adjusted R Square    0.684281394
Standard Error       0.054749695
Observations         6

ANOVA
             df   SS            MS         F          Significance F
Regression   1    0.035481414   0.035481   11.83689   0.026282253
Residual     4    0.011990116   0.002998
Total        5    0.047471531

             Coefficients    Standard Error   t Stat      P-value
Intercept    -3.271377559    0.306285094      -10.68083   0.000435
ln(T)        -0.124441862    0.036169936      -3.440478   0.026282

Table 25. Regression Results for 1990 to 1995


Now the parameter α is -0.12 for the last 9,339.4 hours of operations. That means we have had less rapid reliability growth in the last six years. Figure 39 depicts Duane’s regression and failure rate versus time plot:

Figure 39. Duane’s Regression and Failure Rate versus Time for 1990 to 1995


Comparing the two time periods, we can say that the rate of reliability growth for the last six years, from 1990 to 1995 (factor of -0.12), decreased compared to the overall factor of -0.25 for the whole ten-year period from 1986 to 1995.

3. Using the same data set, we concentrate on the first six years, from 1986 to 1991.


Year   Mishaps   Flight hours
86     5         96.3
87     9         447.1
88     24        1050.9
89     21        1310.5
90     21        1407.9
91     28        2156.6

[Chart: Operating hours vs time]

Table 26. RQ-2 Pioneer Data, 1986 to 1991


We follow Duane’s theory and analyze the data as seen in the next table:

N (Cum Mish)   T (Cum flight hours)   N/T        ln(T)      ln(N/T)     Regression    exp(regression)
5              96.3                   0.051921   4.567468   -2.95803    -3.0470131    0.047500591
14             543.4                  0.025764   6.297846   -3.658788   -3.4860799    0.030620672
38             1594.3                 0.023835   7.37419    -3.736604   -3.7591921    0.023302559
59             2904.8                 0.020311   7.97412    -3.896582   -3.9114186    0.020012093
80             4312.7                 0.01855    8.369319   -3.987293   -4.0116967    0.018102654
108            6469.3                 0.016694   8.774823   -4.092692   -4.1145894    0.016332645

Table 27. Duane’s Theory Data Analysis for 1986 to 1991


Now  the  results  from  the  regression  analysis  follow: 


SUMMARY OUTPUT

Regression Statistics
Multiple R           0.975768582
R Square             0.952124325
Adjusted R Square    0.940155406
Standard Error       0.099437813
Observations         6

ANOVA
             df   SS            MS            F             Significance F
Regression   1    0.786578144   0.786578144   79.54973601   0.000873629
Residual     4    0.039551515   0.009887879
Total        5    0.826129659

             Coefficients    Standard Error   t Stat         P-value
Intercept    -1.888061479    0.209552206      -9.009981407   0.000840246
ln(T)        -0.25374049     0.028449223      -8.919065871   0.000873629

Table 28. Regression Results for 1986 to 1991

Now α is -0.25 for the first 6,469.3 hours of operations. In the next figure, we see Duane’s regression and failure rate versus time plots:

Figure 40. Duane’s Regression and Failure Rate versus Time for 1986 to 1991 (legend: failure rate vs time; Duane’s regression line)

If we compare the first six years with the last six years, we can say that the rate of reliability growth for the last six years has decreased, with a factor of -0.12 instead of the factor of -0.25 that relates to the first six years. We do not know why the reliability growth rate has decreased, but it has.
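
The comparison between the two six-year windows can be checked with a short sketch that restarts the cumulative counts at the first year of each window, as Tables 24 and 27 do. The function name and the dictionary layout are assumptions for illustration; the slope formula is the standard least-squares expression written in terms of raw sums.

```python
import math

MISHAPS = {86: 5, 87: 9, 88: 24, 89: 21, 90: 21,
           91: 28, 92: 20, 93: 8, 94: 16, 95: 16}
FLIGHT_HOURS = {86: 96.3, 87: 447.1, 88: 1050.9, 89: 1310.5, 90: 1407.9,
                91: 2156.6, 92: 1179.3, 93: 1275.6, 94: 1568.0, 95: 1752.0}


def duane_slope(first_year, last_year):
    """Slope of ln(N/T) versus ln(T), with N and T restarted at first_year."""
    xs, ys = [], []
    n_cum, t_cum = 0, 0.0
    for year in range(first_year, last_year + 1):
        n_cum += MISHAPS[year]
        t_cum += FLIGHT_HOURS[year]
        xs.append(math.log(t_cum))
        ys.append(math.log(n_cum / t_cum))

    count = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (count * sum_xy - sum_x * sum_y) / (count * sum_xx - sum_x ** 2)


early = duane_slope(86, 91)  # first six years
late = duane_slope(90, 95)   # last six years
print(f"1986-1991: {early:.4f}   1990-1995: {late:.4f}")
```

A smaller magnitude of the slope in the later window means the cumulative mishap rate is falling more slowly there. Note that the two windows overlap in 1990 and 1991, so they are not independent samples.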

4. We can use the Duane curve to predict the failure rate, and hence the MTBF, for the future. From the previous discussion of Duane’s plots in III.B.4, the cumulative failure rate is $\lambda = K \cdot T^{\alpha}$, where $K = e^{b}$. Using the results for the last six years, we have $\alpha = -0.1244$ and $b = -3.2714$. So the equation for the curve is $\lambda = e^{-3.2714} \cdot T^{-0.1244}$. Figure 41 shows the prediction curve. For example, at 12,000 hours of operation after 1990, the predicted failure rate is 0.011793 failures per hour of operation, or about 12 failures per 1,000 hours of operation, which corresponds to an MTBF of about 85 hours.
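
The prediction can be reproduced numerically from the coefficients in Table 25. The sketch below is a minimal illustration with assumed names. It uses the full-precision coefficients from the table, and it treats the fitted quantity as the cumulative mishap rate, whose reciprocal is the cumulative mean time between mishaps.

```python
import math

ALPHA = -0.124441862  # slope for 1990 to 1995, from Table 25
B = -3.271377559      # intercept for 1990 to 1995, from Table 25


def predicted_failure_rate(cum_hours):
    """Cumulative failure rate K * T**alpha, with K = exp(b)."""
    return math.exp(B) * cum_hours ** ALPHA


rate = predicted_failure_rate(12000.0)
mtbf = 1.0 / rate  # mean flight hours between mishaps

print(f"rate = {rate:.6f} per hour")
print(f"per 1,000 hours = {rate * 1000:.1f}")
print(f"MTBF = {mtbf:.1f} hours")
```

Such an extrapolation assumes that the growth pattern of 1990 to 1995 continues unchanged, which the data alone cannot guarantee.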


Figure 41. Prediction Plot Curve
V. CONCLUSION


A. SUMMARY

From the material presented in this thesis, we can conclude the following:

1. There is a real need for reliability improvement in Small UAV systems.

2. RCM (or MSG-3) is a system suitable for civil and military manned aviation and other industry fields in which experience is prevalent, hidden failures can be easily identified by personnel, and safety considerations are the primary factor. For small UAV systems in military applications, safety is not the primary factor. Experience has not reached manned aviation levels, and hidden failures for unmanned systems are very difficult to observe. Therefore MSG-3 is not a suitable standard for SUAVs.

3. FMEA may be used for almost any kind of reliability analysis that focuses on finding the causes of failure. A good and complete knowledge of the system is necessary prior to proceeding with the FMEA. FMEA is an appropriate method for SUAVs. This thesis has developed FMEA forms for the SUAV.

4. FTA is another useful method of analysis based on the top-down approach and can be used to focus only on the weak points that need enhancing. It is appropriate for SUAVs, and can be used to focus on engine, control, and navigation subsystems that are among the most critical elements. We developed FTA diagrams for the SUAV in this thesis.

5. Functional flow diagrams or block diagrams are used to give a quick and comprehensive view of the system design requirements, illustrating series and parallel relationships, hierarchy, and other relationships among the system’s functions. Since a SUAV is essentially a series system, this method is less useful.

6. FRACAS, a failure reporting, analysis, and corrective action system, should be implemented for a program during production, integration, test, and field deployment phases to allow for the collection and analysis of reliability and maintainability data for the hardware and software items. For a successful reliability improvement program, all failures should be considered. SUAVs need a FRACAS. This research effort developed the framework for one aircraft, including the necessary forms.

7. For SUAVs we have to use fault avoidance due to size and weight limitations. Redundancy cannot be easily implemented, especially due to platform cost and size constraints.

8. In a series structure, like SUAVs, the component with the lowest reliability is the most important one for reliability improvement. Currently, there is no basis for estimating the reliability of a system for operational planners.

9.  We  can  track  the  overall  reliability  of  a  system  for  SUAVs  under 
experimental  development  by  implementing  a  method  that  records  failure  data.  By 
analyzing  the  data  we  can  calculate  and  predict  reliability  growth. 

10.  Similarly  we  can  track  the  reliability  of  subsystems  of  a  SUAV 
system.  We  divide  the  system  into: 

•  Propulsion  and  power 

•  Flight  control  and  navigation 

•  Communication 

•  GCS  (Human  in  the  loop) 

•  Miscellaneous 

and keep track of the reliability for each subsystem separately. The forms that we have developed can be used as a data source for tracking subsystem reliability separately.

12.  For  a  reliability  improvement  program,  we  need  to: 

•  Conduct  an  Environmental  Stress  Screening  (ESS), 

• Calibrate and verify the instruments for the field tests or field operations,

• Set the initial weather restrictions for UAV flights,

• Execute an FMEA of the system and/or perform an FTA,

• Establish a FRACAS,

• Track reliability improvement,

• Complete a reliability improvement plan.

13.  Reliability  costs,  and  benefits,  are  like  an  investment.  One  truly  gets 
what  one  pays  for. 

This thesis is a qualitative approach to the issue of reliability and UAVs. In order to obtain further benefit and value from that research effort, we must have data. For a specific type of UAV, we can start implementing FRACAS and collecting data. A database can be created easily after the implementation of FRACAS, and we can start analyzing and interpreting reliability improvement, if any, quite soon.


B. RECOMMENDATIONS FOR FUTURE RESEARCHERS

This thesis outlines methods of improving SUAV reliability. Methods must be defined for better data collection. Real data from SUAV systems must be collected in order to build reliability databases. Quantitative reliability analysis can then follow, yielding detailed information about reliability improvement results.

Researching many issues would be worthwhile.

1. SUAVs are considered expendable since no pilot is onboard. As we increase their reliability, their cost, and their importance in battlefield operations, we have to start considering their survivability. Being small in size may not be enough to cope with enemy fire. Researching survivability issues for SUAVs is another field of interest with many extensions and relations to design philosophy and cost.

2. Some experts believe that difficult problems can be solved with better software, but software is not free. In the near network-centric future, software will probably be one of the most expensive parts of a UAV system. Additionally, software is a dynamic part of the system. It must be constantly upgraded to meet new expectations, or to integrate new equipment technologies. For that reason software reliability is another critical issue that will become more intense in the near future. The emerging question is how we can find the best means to maintain software reliability at acceptable levels.

3. Similar to the above issue, micro-technologies are quickly evolving. New ones are rapidly being inserted into UAV systems. In what way can our reliability tracking methodology cope with new subsystems?

4.  If  there  is  a  need  to  achieve  a  certain  level  of  reliability,  what  would  the 
economic  consequences  be? 

5.  Generally,  it  would  be  of  great  interest  to  research  the  potential 
mechanisms  for  incorporating  new  equipment  into  a  reliability  improvement  program. 

6.  What  is  the  best  number  of  maintenance  personnel  to  keep  the  system  at 
a  given  level  of  availability? 

7.  What  should  the  spares  policy  be  for  SUAVs? 

8. What fraction of failures is due to software rather than hardware?

Data  collected  using  the  methods  developed  in  this  thesis  will  provide  the 
material  with  which  to  answer  these  essential  questions. 
APPENDIX A: DEFINITION OF FMEA FORM TERMS


1.  First  Part  of  the  Analysis  of  Design 

(1) Subsystem Identification: Name the subsystem or identification title of the FMEA.

(2)  Design  Responsibility:  Name  the  system  design  team  and  for  (2A) 
name  the  head  of  the  system  design  team. 

(3)  Involvement  of  Others:  Name  other  people  or  activities  within  the 
company  that  affect  the  design  of  the  system. 

(4)  Supplier  Involvement:  Name  other  people,  suppliers  and/or  outside 
organizations  that  affect  the  design  of  the  system. 

(5)  Model/product:  Name  the  model  and/or  the  product  using  the  system. 

(6)  Engineering  Release  Date:  This  is  the  product  release  date. 

(7) Prepared by: The name of the FMEA design engineer.

(8) FMEA Date: Record the date of the FMEA initiation.

(9) FMEA Date, revision: Record the date of the latest revision.

(10)  Part  Name:  Identify  the  part  name  or  number. 

2.  The  Second  Part  of  the  Analysis  of  Design  FMFA-?<^7 

(11) Design Function: This is the objective function of the design. The
function should be described in specific terms. Active verbs defining functions and
appropriate nouns should be used.

(12) Potential Failure Mode: The failure mode is the loss of a design
function or a specific failure. “For each design function identified in Item 11 the
corresponding failure of the function must be listed. There can be more than one failure
from one function.” To identify the failure mode ask the question: “How could this
design fail?” or “Can the design break, wear, bind and so on?” Another way to identify a
failure mode is through an FTA. In an FTA the top level is the loss of the part function and
the lower levels are the corresponding failure modes.

166 The material from this section is taken (in some places verbatim) from: Stamatis, pages 130-132.

167 Ibid, pages 132-149.

(13) Potential Effect(s) of Failure: This is the ramification of the failure on
the design. The questions usually asked are “What does the user experience as a result of
that failure?” or “What are the consequences for the design?” To identify the potential
effects, documents such as historical data, warranty documents, field-service data, reliability
data and others may be reviewed. If safety is an issue, then an appropriate notation should
be made.

(14) Critical Characteristics: Examples of critical items may be
dimensions, specifications, tests, processes, etc. These characteristics affect safety and/or
compliance with rules and regulations and call for special actions or controls.
An item is indicated as critical when its severity is rated 9 to 10 with occurrence and
detection ratings higher than 3.

(15) Severity of Effect: Indicates the seriousness of a potential failure. For
critical effects severity is high, while for minor effects severity is very low. Usually there
is a rating table, which is used for evaluation purposes. This table is constructed so
that all design issues are taken into consideration. The severity rating should be
based on the worst effect of the failure mode. An example of the severity guideline table
for design FMEA is in Table 29.
Effect        Rank  Criteria
None          1     No effect.
Very slight   2     User not annoyed. Very slight effect on the product performance. Non-essential fault noticed occasionally.
Slight        3     User slightly annoyed. Slight effect on the product performance. Non-essential fault noticed frequently.
Minor         4     User’s annoyance is minor. Minor effect on the product performance. Non-essential faults almost always noticed. Fault does not require repair.
Moderate      5     User has some dissatisfaction. Moderate effect on the product performance. Fault requires repair.
Significant   6     User is inconvenienced. Degradation on product’s performance but safe and operable. Non-essential parts inoperable.
Major         7     User is dissatisfied. Major degradation on product’s performance but safe and operable. Some subsystems are inoperable.
Extreme       8     User is severely dissatisfied. Product is safe but inoperable. System is inoperable.
Serious       9     Safe operation and compliance with regulations are in jeopardy.
Hazardous     10    Unsafe for operation, non-compliance with regulations, completely unsatisfactory.

Table 29. Example of Severity Guideline Table for Design FMEA (After Stamatis, page 138)

(16) Potential Cause of Failure: This identifies the cause of a failure mode.
A failure mode may have a single cause or numerous causes; in the latter case the causes
are often symptoms of one root cause. A good understanding of the system’s functional analysis
is needed at this stage. Asking “Why?” five times is the rule of thumb for tracing a failure
mode back to its root cause. It is essential to identify all potential failures while
performing the FMEA. There is not always a linear or “one-to-one relationship” between
the cause and failure mode. Listing as many causes as possible makes the FMEA easier and
less error prone. If the severity of a failure is rated 8 to 10, then an effort should be
made to identify as many root causes as possible.

(17) Occurrence: This is the value that corresponds to the estimated
frequency of failures for a given cause over the life of the design. To identify the
frequency for each cause, we need reliability mathematics, expected frequencies or the
cumulative number of component failures per 100 or 1000 components (CF/100 or
CF/1000). If expected frequencies and/or the cumulative number of failures cannot be
estimated, then similar systems or components could be examined for data that
could be used as a surrogate. Usually, a design FMEA assumes a single-point failure,
that is, a component failure that could cause the system to fail and is not
compensated for by an alternative method. Occurrence therefore refers to a single-cause
failure. A guideline for occurrence is shown in Table 30.


Occurrence         Rank  Criteria                                                 CF/1000
Almost impossible  1     Failure unlikely. Historical data indicate no failures   <0.00058
Indifferent        2     Rare number of failures likely                           0.0068
Very slight        3     Very few failures likely                                 0.0063
Slight             4     Few failures likely                                      0.46
Low                5     Occasional number of failures likely                     2.7
Medium             6     Medium number of failures likely                         12.4
Moderately high    7     Moderately high number of failures likely                46
High               8     High number of failures likely                           134
Very high          9     Very high number of failures likely                      316
Almost certain     10    Failure almost certain                                   >316

Table 30. Example of Occurrence Guideline Table for Design FMEA (After Stamatis, page 142)

(18) Detection Method: This is a procedure, test, design review or analysis used to
detect a failure in a design or part. Identifying problems before they reach the end
user can be very simple or very difficult. If there is no method, then “None identified at
this time” is the answer. Two of the leading questions are “How can this failure be
discovered?” and “In what way can this failure be recognized?” A checklist may be
helpful. Nevertheless, some of the most effective ways to detect a failure are simulation
techniques, mathematical modeling, prototype testing, specific design tolerance studies
and design and material review. The design review is an important way to revisit the
suitability of the system or design. A design review can be quantitative or qualitative,
using a systematic methodology of questioning and design.

(19) Detection: This is the “likelihood that the proposed design controls will
detect” the root cause of a failure mode before it reaches the end user. The detection
rating estimates the ability of each of the controls in (18) to detect failures before they
reach the customer. A typical detection guideline is shown in Table 31.


Detection          Rank  Criteria
Almost certain     1     Has the highest effectiveness
Very high          2     Has very high effectiveness
High               3     Has high effectiveness
Moderately high    4     Has moderately high effectiveness
Medium             5     Has medium effectiveness
Low                6     Has low effectiveness
Slight             7     Has very low effectiveness
Very slight        8     Has the lowest effectiveness
Indifferent        9     It is unproven, or unreliable, effectiveness unknown
Almost impossible  10    There is no design technique available or known

Table 31. Example of Detection Guideline Table for Design FMEA (After Stamatis, page 147)


(20) Risk Priority Number (RPN): This is the product of the severity,
occurrence, and detection ratings. The RPN is simply a number that represents the priority
of the failure. Reducing the RPN is the goal of the FMEA, and it is achieved by reducing
the severity, occurrence, and/or detection ratings. Changing the design can reduce the
severity rating. Improving the requirements and engineering specifications while focusing on
“preventing causes or reducing their frequencies” can reduce the occurrence rating.
Adding detection equipment and tools or “improving the design evaluation technique”
can reduce the detection rating.

(21) Recommended Actions: These may be specific actions or suggestions
for further study. Recommended actions are intended to reduce the RPN for the different
failure modes. Failure modes must be prioritized according to their RPN, severity and
occurrence while conducting an FMEA.
(22) Responsible Area or Person and Completion Date: Name the
responsible person/area and the completion date for the recommended action.

(23) Action Taken: This records the follow-up actions.

(24) Revised RPN: This is the reevaluation of the RPN after the corrective
actions have been implemented. If the revised RPN is less than the original, then that
indicates an improvement.
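The rating arithmetic in items (14), (20), (21) and (24) can be sketched in a few lines of Python. This is only an illustration: the class name, the function name and the three SUAV failure modes with their ratings are hypothetical and are not taken from Stamatis or from any data collected for this thesis.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a design FMEA form; each rating is an integer from 1 to 10."""
    name: str
    severity: int     # item (15)
    occurrence: int   # item (17)
    detection: int    # item (19)

    def rpn(self) -> int:
        # Item (20): the RPN is the product of the three ratings.
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be between 1 and 10")
        return self.severity * self.occurrence * self.detection

    def is_critical(self) -> bool:
        # Item (14): severity 9 to 10 with occurrence and detection above 3.
        return self.severity >= 9 and self.occurrence > 3 and self.detection > 3


def prioritize(modes):
    """Item (21): order by RPN, then severity, then occurrence, highest first."""
    return sorted(modes, key=lambda m: (m.rpn(), m.severity, m.occurrence),
                  reverse=True)


# Hypothetical failure modes with made-up ratings.
modes = [
    FailureMode("engine stops in flight", severity=9, occurrence=5, detection=4),
    FailureMode("data link dropout", severity=6, occurrence=7, detection=3),
    FailureMode("landing gear cracks", severity=5, occurrence=4, detection=6),
]
ranked = prioritize(modes)

# Item (24): a corrective action that lowers occurrence lowers the revised RPN.
revised = FailureMode("engine stops in flight", severity=9, occurrence=2, detection=4)
improved = revised.rpn() < modes[0].rpn()
```

Because the RPN is a product of ordinal ratings, it is useful for ranking failure modes against each other rather than as an absolute measure of risk.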

3. Third Part of the Analysis of Design FMEA168

(25) Approval signatures: Name the authority that approves conducting the FMEA.

(26) Concurrence signatures: Name those responsible for carrying out the FMEA.


168  Stamatis,  page  149. 
APPENDIX B: THE MRB PROCESS


The Maintenance Review Board process (MRB process) “is broadly defined as all of the
activities necessary to produce and maintain a Maintenance Review Board Report
(MRBR).” The process involves three major objectives, which are to ensure that:

1.  Scheduled  maintenance  instructions  (tasks  and  intervals)  which  are 
developed  for  a  specific  aircraft,  contribute  to  the  continuing  airworthiness  and 
environmental  requirements  of  the  Regulatory  Authorities  and  the  Standards  and 
Recommended  Practices  (SARPs)  as  published  by  the  International  Civil  Aviation 
Organization  (ICAO). 

2.  The  tasks  are  realistic  and  capable  of  being  performed. 

3.  The  developed  scheduled  maintenance  instructions  may  be  performed 
with a minimum of maintenance expense.169

“MRBRs  are  developed  as  a  joint  exercise  involving  the  air  operators,  the  type  of 
certificate  applicant,”  ATA  and  other  Regulatory  Authorities.  The  MRB  process 

consists  of  a  number  of  specialist  working  groups  who  use  an 
analytical  logic  plan  to  develop  and  propose  maintenance/inspection  tasks 
for  a  specific  aircraft  type.  The  proposed  tasks  are  presented  to  an  Industry 
Steering  Committee  (ISC)  who,  after  considering  the  working  group 
proposals,  prepares  a  proposal  for  the  MRBR. 

The MRB chairperson reviews the proposed MRBR, which is then published as
the MRBR.170


169 Transport Canada Civil Aviation (TCCA), Maintenance Instruction Development Process, TP
13850, Part B, “The Maintenance Review Board (MRB) Process (TP 13850), Chapter 1. General,” last
updated: April 19, 2003, Internet, February 2004. Available at:
http://www.tc.gc.ca/civilaviation/maintenance/aarpd/tp13850/partB.htm

170  TCCA,  Chapter  2. 
APPENDIX C: FAILURES


1. Functions171

A function statement should consist of a verb, an object and a desired standard of
performance. For example: A SUAV platform flies up to 4,000 feet at a sustained speed of
at least 55 knots. The verb is “fly,” the object is “a SUAV platform,” and the
standard is “up to 4,000 feet at a sustained speed of at least 55 knots.”

2. Performance Standards172

In our example: One process that degrades the SUAV platform, in other words
one failure mode for the SUAV, is engine failure. Engine failure happens for many
reasons. The question is how much an engine failure can impair the ability of the SUAV to
fly at the desired altitude at the designated sustained speed.

To allow for degradation, the SUAV must be able to perform better than the
minimum standard of performance desired by the user. What the asset is able to deliver is
known as its “initial capability,” say 4,500 feet at 60 knots sustained speed. This leads
one to define performance as:

• Desired performance, which is what the user wants the asset to do (4,000
feet at 55 knots sustained speed in our case).

• Built-in capability, which is what the asset can actually do (4,500 feet at 60 knots
sustained speed in our case).

3. Different Types of Functions173

Every physical asset usually has more than one function. If the objective of
maintenance is to ensure that the asset can continue to fulfill these functions, then they
must all be identified together with their current standards of performance.

Functions are divided into two main categories: primary and secondary functions.

171  Moubray,  John,  an  excerpt  of  the  first  chapter  of  the  book  “Reliability-centered  Maintenance,” 
Plant  Maintenance  Resource  Center,  “Introduction  to  Reliability-centered  Maintenance,”  Revised 
December  3,  2002,  Internet,  May  2004,  Available  at:  http://www.plant-maintenance.com/RCM-intro.shtml 

172  Moubray,  “Introduction.” 

173  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  Moubray,  “Introduction.” 
a. Primary functions are fairly easy to recognize, and most industrial assets
are named after their primary functions. For example, the primary function of a “printer” is
to print documents, and of a “crusher” is to crush something, etc. In the SUAV example
the primary function is to provide lift and thrust so that the platform flies up to 4,000
feet at a sustained speed of at least 55 knots.

b. In addition to their primary functions, most assets are expected to fulfill
one or more additional functions, which are the secondary functions. For example, a
secondary function of the SUAV platform could be to use an auto-recovery system.
Secondary functions could include environmental expectations, safety, control,
containment, comfort, appearance, protection, economy, efficiency and other extra
functions.

4. Functional Failure174

If, for any reason, the asset is unable to do what the user wants, the user will
consider it to have failed. “Failure is defined as the inability of any asset to do what its
users want it to do.” This definition treats the concept of failure as if it applies to an asset
as a whole.

However, each asset has more than one function, and each function often has
more than one desired standard of performance. The asset can fail with respect to each
function, so it can be in several different failed states. Therefore, failure must be
defined more precisely in terms of the loss of specific functions rather than the failure of
the asset as a whole.

According to British Standard (BS) 4778, failure is defined as “The termination of
an item’s ability to perform a required function.”

174 The material from this section is taken (in some places verbatim) from: Moubray, “Introduction.”

175 The material from this section is taken (in some places verbatim) from: Hoyland, pages 11-12.

5. Performance Standards and Failures175

The limit between satisfactory performance and failure is specified by a
performance standard. A functional failure is defined as the inability of any “asset to
fulfill a function to a standard of performance, which is acceptable to the user.”176


A functional failure has several aspects:

• Partial and total failure

• Upper and lower limits

• Gauges and indicators

• The operating context

Failures may be classified in many different ways:

a. Sudden versus gradual failures

b. Hidden versus evident failures

c. According to the severity of their effects

(1) Critical failure: A failure that is sudden and causes termination
of one or more primary functions.

(2) Degraded failure: A failure that is gradual and/or partial.

(3) Incipient failure: A deficiency in the condition of an item such
that a critical or degraded failure can be expected unless corrective action is taken.

d. Another classification according to the severity of effects, from US Mil-Std
882, “System Safety Program Requirements”:

(1) Catastrophic, which results in loss of life and/or loss of system.

(2) Critical, which results in severe injury and/or illness and/or
severe system damage.

(3) Marginal, which results in minor injury and/or illness and/or
minor system damage.

(4) Negligible, with less than minor results.

e. Another classification, according to the cause of the failure:

(1) Primary failure due to aging.

(2) Secondary failure due to excessive stresses.

(3) Command fault or transient failures due to improper control
signal or noise.

176 The material from this part of section is taken (in some places verbatim) from: Aladon Ltd,
“Introduction.”

6. Failure Modes177

“Once each functional failure has been identified, the next step is to try to identify
all the events that are reasonably likely to cause each failed state. These events are known
as failure modes.” The failure modes to consider include those that have occurred on the
same or similar equipment operating with the same parameters and conditions, failures
that are currently being prevented by existing maintenance policies, and failures that
have not yet happened but are considered likely to happen.178

Failure  mode  is  “the  effect  by  which  a  failure  is  observed  on  the  failed  item.” 
Technical  items  are  designed  to  perform  one  or  more  functions.  So  a  failure  mode  can  be 
defined  as  nonperformance  of  one  of  these  functions.  Failure  modes  may  generally  be 
subdivided  as  “demanded  change  of  state  is  not  achieved”  and  “change  of  conditions.” 

For  example,  an  automatic  valve  may  show  one  of  the  following  failure  modes: 

a.  Fail  to  open  on  command 

b.  Fail  to  close  on  command 

c.  Leakage  in  closed  position 

The  first  two  failure  modes  are  “demanded  change  of  state  is  not  achieved”  while 
the  third  one  is  “change  of  condition.” 

177 The material from this section is taken (in some places verbatim) from: Hoyland, page 10.

178 The material from this part of section is taken (in some places verbatim) from: Aladon Ltd,
“Introduction,” page 5.

179 The material from this section is taken (in some places verbatim) from: Aladon Ltd,
“Introduction,” page 5.

7. Failure Effects179

The fourth of the seven questions in the RCM process, as previously mentioned in
IIA2b of this thesis, is “What happens when each failure occurs?” The answers are
known as “failure effects.”

Failure effects describe what happens when a failure occurs. While describing the
effects of a failure, the following should be recorded:

a.  What  is  the  evidence  that  the  failure  has  happened? 

b.  In  what  way  does  it  pose  a  threat  to  safety  or  the  environment? 

c.  In  what  way  does  it  affect  production  or  operation? 

d.  What  physical  damage  is  caused  by  the  failure? 

e.  What  must  be  done  to  repair  the  failure? 

8. Failure Consequences180

Failures affect output, but other factors such as product quality, customer service,
safety or environment also influence output. The nature and severity of these effects
govern the consequences of the failure. The failure effects tell us what happens when
a failure occurs. The consequences describe how and how much it matters. For example,
if we can reduce the occurrence (frequency) and/or severity of failure effects, then we can
reduce the consequences.

Therefore,  if  a  failure  matters  very  much,  efforts  will  be  made  to  mitigate  or 
eliminate  the  consequences.  On  the  contrary,  if  the  failure  is  of  minor  consequence,  no 
proactive  action  may  be  needed. 

A  proactive  task  is  worth  doing  if  it  reduces  the  consequences  of  the  failure  mode 
and  justifies  the  direct  and  indirect  costs  of  doing  the  task. 

Failure  consequences  could  be  classified  as: 

a. Environmental and safety consequences, when the failure causes a breach of
local, national, or international environmental standards, or causes
injury and/or death.

b. Operational, if the failure affects the operation, production output,
quality, cost or customer satisfaction.

c. Non-operational, when only maintenance and/or repair is involved,
without affecting the environment, safety or production.

d. Hidden, when failures have no direct impact, but they expose the
organization to multiple failures with serious and often catastrophic consequences.

180 The material from this section is taken (in some places verbatim) from: Aladon Ltd,
“Introduction,” page 5.
APPENDIX D: RELIABILITY


1. Introduction to Reliability181

Reliability  is  a  concept  that  has  dominated  systems  design,  performance  and 
operation  for  the  last  60  years.  It  appeared  after  WWI,  when  it  was  used  to  compare 
operational  safety  of  one,  two,  three,  and  four-engine  airplanes.  At  that  time  reliability 
was  measured  as  the  number  of  accidents  per  flight  hour. 

During WWII, a group of scientists under Wernher von Braun in Germany
developed the V-1 missile. After the war it was reported that the first ten V-1 missiles
were all complete failures. All of the first missiles either exploded on the launching rail
or landed earlier than planned, in the English Channel. It was the mathematician Robert
Lusser who analyzed the missile system and derived the “product probability law of
series components.” The theorem states that “a system is functioning only if all the
components are functioning and is valid under special assumptions.” It simply says that
the reliability of the system is equal to the product of the reliabilities of the system’s
individual components. If the system has many components, then its reliability is rather
low, even though the individual components have high reliabilities.
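The product law described above can be illustrated with a short Python sketch. It assumes that component failures are independent (the “special assumptions” of the theorem); the function name and the figures of 100 components at 0.99 reliability each are hypothetical and chosen only to show the effect.

```python
from math import prod


def series_reliability(component_reliabilities):
    """Reliability of a series system, assuming independent component failures."""
    for r in component_reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must lie between 0 and 1")
    return prod(component_reliabilities)


# Hypothetical system: 100 components in series, each with reliability 0.99.
system = series_reliability([0.99] * 100)
print(f"System reliability: {system:.3f}")
```

With these made-up numbers the system reliability is roughly 0.37, even though every component is 99 percent reliable, which is the point of the paragraph above.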

In order to avoid low system reliability, engineers in the USA at that time tried to
improve the system’s individual components. They used “better” materials and “better”
designs for the products. The result was higher system reliability, but no broader
analysis of the problem was performed.

By the end of the 1950s and early 1960s, interest in the USA focused on production
of the intercontinental ballistic missile and space research like the Mercury and Gemini
programs. In the race to put a man on the moon, a reliable program was very important.
The first association for engineers working with reliability issues was established. IEEE
Transactions on Reliability, the first journal on the subject, was published in 1963. After
that, a number of textbooks were published and in the 1970s many countries from Europe
and Asia began dealing with the same issues. Soon it became clear that a low reliability
level cannot be compensated for by extensive maintenance.

181 The material from this section is taken (in some places verbatim) from: Hoyland, pages 1-2.

2. What is Reliability?

“Until  the  1960s,  reliability  was  defined  as  the  probability  that  an  item  will 
perform  a  required  function  under  stated  conditions  for  a  stated  period  of  time.” 
According to International Organization for Standardization (ISO) 8402 and British Standard
(BS) 4778, “reliability is the ability of an item to perform a required function, under
given environmental and operational conditions and for a stated period of time.” The term
“item” is used to denote any component, subsystem or entire system. A “required
function” may be a single function or a combination of functions necessary to provide a
certain service.182

For a defense acquisition system, reliability is a measure of effectiveness.183 It is
one of the “ilities” that a system needs to comply with in order to be operationally
suitable.

We  can  keep  track  of  reliability  by  measuring  or  calculating  some  measures  of 
performance  such  as: 

a.  The  probability  of  completing  a  mission 

b.  The  number  of  hours  without  a  critical  failure  under  specified  mission 
conditions  or  mean  time  between  critical  failures  (MTBCF) 

c.  The  probability  of  success  as  the  number  of  successes  divided  by  the 
total  number  of  attempts 

d.  The  mean  time  to  failure  (MTTF) 

e.  The  failure  rate  (failures  per  unit  time) 

f.  The  probability  that  the  item  does  not  fail  in  a  time  interval. 
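The measures listed above can be computed from simple operating records, as the following Python sketch suggests. The function names and all the numbers are hypothetical. The last function additionally assumes a constant failure rate (an exponential time-to-failure model), which is a simplifying assumption made here for illustration and is not a claim about SUAV data.

```python
from math import exp


def success_probability(successes, attempts):
    """Measure (c): number of successes divided by total number of attempts."""
    return successes / attempts


def mean_time_to_failure(times_to_failure):
    """Measure (d): average of the observed times to failure."""
    return sum(times_to_failure) / len(times_to_failure)


def failure_rate(n_failures, total_operating_time):
    """Measure (e): failures per unit of operating time."""
    return n_failures / total_operating_time


def survival_probability(rate, t):
    """Measure (f), ASSUMING a constant failure rate (exponential model)."""
    return exp(-rate * t)


# Hypothetical records: times to failure in flight hours for four items.
times = [12.0, 30.0, 18.0, 40.0]
mttf = mean_time_to_failure(times)             # flight hours per failure
rate = failure_rate(len(times), sum(times))    # failures per flight hour
p_two_hours = survival_probability(rate, 2.0)  # no failure in a 2-hour sortie
p_mission = success_probability(45, 50)        # 45 successes in 50 attempts
```

Under the constant-rate assumption the failure rate is the reciprocal of the MTTF; MTBCF in measure (b) would be computed the same way as the MTTF, counting only critical failures.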

3.  System  Approach 

A system is a group of elements, parts, or components that work together for a
specified purpose. A failure of the system is related to the failure of at least one of
its parts, elements or components. A part starts in its working state and for various
reasons changes to a failed state after a certain time. The time to failure is considered a
random variable that we can model by a failure-distribution function.184

182 Hoyland, page 3.

183 Hoivik, slide 6.

Failure occurs due to a complex set of interactions between the material properties
and/or physical properties of the part and/or stresses that act on the part. The failure
process is complex and is different for different types of parts, elements or
components.185

The strength or endurance of a part may vary significantly and unpredictably
because of manufacturing variability, so the strength, say “X”, must be modeled as a
random variable. When the system is being used, the part is subjected to a stress, say “Y”. If
“X” is less than “Y”, then the part fails immediately because its strength is not enough to
withstand the magnitude of stress “Y”. If “Y” is less than “X”, then the strength of that
part is enough to withstand the stress and the part is functional.
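The stress-strength comparison above leads directly to a probability that the part survives, P(X > Y). The Python sketch below evaluates it under assumptions that go beyond the text: both the strength X and the stress Y are taken to be independent and normally distributed, and the means and standard deviations are made-up numbers in arbitrary units.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def stress_strength_reliability(mean_x, sd_x, mean_y, sd_y):
    """P(X > Y) for independent, normally distributed strength X and stress Y.

    X - Y is then normal with mean (mean_x - mean_y) and variance
    (sd_x**2 + sd_y**2), so P(X - Y > 0) follows from the normal CDF.
    """
    return normal_cdf((mean_x - mean_y) / sqrt(sd_x ** 2 + sd_y ** 2))


# Hypothetical part: strength ~ N(100, 10), applied stress ~ N(70, 10).
reliability = stress_strength_reliability(100.0, 10.0, 70.0, 10.0)
print(f"P(strength > stress) = {reliability:.4f}")
```

Other distributions would need a different formula or a simulation; the normal case is used here only because it has a closed form.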

Even though the failure mechanisms vary, they are basically divided into two
categories, overstress and wear-out. Overstress failures are those due to
fracture, yielding, buckling, large elastic deformation, electrical overstress, and thermal
breakdown. Wear-out failures are those due to wear, corrosion, metal migration,
inter-diffusion, fatigue-crack propagation, diffusion, radiation, fatigue-crack initiation and
creep.186

For multi-component systems like a SUAV the number of parts may be very large
and a multilevel decomposition of such a system is necessary.

4.  Reliability  Modeling 

184 Hoyland, page 18.

185 Pecht, page 93.

186 Ibid, page 96.

187 The material from this section is taken (in some places verbatim) from: Blischke, pages 204-205.

a. System Failures187

System failures for a multi-component system can be modeled in several
ways. A system failure is due to the failure of at least one of its components, so analysis
of failures at the component level is the starting point of a system failure analysis. “Henley
and Kumamoto (1981) propose the following classification of failures:

(1)  Primary  failure 

(2)  Secondary  failure 

(3)  Command  fault” 

Primary  or  “natural”  is  when  the  component  fails  due  to  natural  causes 
like  aging.  In  that  case,  replacement  of  the  aging  component  is  the  remedy. 

Secondary  or  “induced”  is  the  failure  of  a  component  due  to  excessive 
stress  resulting  from  the  primary  failure  of  some  other  component(s)  and/or 
environmental  factors  and/or  user  actions. 

“Command  fault  occurs  when  a  component  is  in  not  working  state  because 
of  improper  control  signals  or  noise.”  This  can  be  due  to  a  user’s  faulty  operation  or  a 
logic  controller’s  faulty  operation  signal. 

b. Independent vs Dependent Failures188

The  failure  times  of  components  are  often  influenced  by  environmental 
conditions.  As  the  environment  becomes  “harsher,  the  time  it  takes  to  reach  a  failure 
decreases.”  Thus  if  the  system’s  components  share  the  same  environment  their  failure 
times  are  statistically  dependent.  If  the  dependence  is  weak,  it  can  be  ignored  and  failure 
times  can  be  treated  as  statistically  independent.  In  that  way  failure  times  can  be  modeled 
separately  using  univariate  failure-distribution  functions.  But  in  case  of  significant 
dependence,  multivariate  failure  distributions  must  be  used  and  modeling  becomes  much 
more  complicated. 

c. Black-Box Modeling 189

A  system  failure  is  due  to  the  failure  of  one  or  more  of  its  components. 
“The  number  of  failed  components  that  must  be  restored  to  their  working  state  is  usually 
small  relative  to  the  total  number”  of  the  system’s  components.  Replacing  or  repairing 
the  defective  component(s)  restores  the  system  to  its  operational  state.  If  the  restoration 

188 The material from this section is taken (in some places verbatim) from: Blischke, page 205.

189  Ibid. 
time is very small relative to the mean time between failures, then it can be ignored, and
we  can  model  the  failure  system  as  a  function  reflecting  the  effect  of  age.  In  other  words, 
the  model  function  can  be  viewed  as  the  failure  rate  of  the  system  through  time. 

After overhauls, major repairs, or design alterations, the failure rate of the
system can be significantly reduced. It usually becomes smaller than it was before
the intervention.

Therefore,  in  black-box  modeling  we  can  collect  data  through  the  life 
cycle  time  of  a  system  and  find  a  function  that  is  the  failure  rate  through  time.  A  lot  of 
data  is  needed  in  order  for  the  function  to  be  precisely  estimated,  so  black-box  modeling 
is  not  recommended  for  the  design  and  development  phase  of  a  system  because  of  the 
changes  that  continuously  alter  the  failure  rate. 

d. White-Box Modeling 190

“In  a  white-box  modeling,  system  failure  is  modeled  in  terms  of  the 
failures  of  the  components  of  the  system.”  We  can  reach  system  failures  from  component 
failures  using  the  bottom-up  (or  forward)  approach  or  the  top-down  (or  backwards) 
approach.  In  the  forward  approach,  we  start  with  part-level  failures,  and  then  we  proceed 
to  the  system  level  to  evaluate  the  consequences  of  such  failures  on  the  system’s 
performance.  FMEA  uses  this  approach.  In  the  backward  approach,  we  start  at  the  system 
level failures, and then we proceed downward to the part level to relate poor system
performance  to  part-level  failures.  FTA  uses  this  approach. 

“The  linking  of  the  system  performance  to  failures  at  the  part  level  can  be 
done  either  qualitatively  or  quantitatively.”  In  the  qualitative  case,  we  are  interested  in  the 
causal  relations  between  failures  and  system  performance.  In  the  quantitative  case,  we 
can  use  many  measures  of  system  effectiveness,  like  reliability,  in  terms  of  component 
reliabilities. 

As an example assuming independent failures, if a machine has a failure
rate of 1 failure every 100 days, then the probability of having a failure on any given day is
1/100. If a second redundant machine has the same failure rate, then a system that


190  Blischke,  pages  206-207. 
consists of both those machines fails on a given day only if both machines fail on that day,
which has probability (1/100)^2 = 1/10,000.
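The redundancy arithmetic above can be checked with a short Python sketch. It assumes, as the text does, that the machines fail independently and that each has a daily failure probability of 1/100. The function name `prob_all_fail` is illustrative only and does not come from the cited sources.

```python
from fractions import Fraction


def prob_all_fail(daily_failure_probs):
    """Probability that every unit fails on the same day,
    assuming the units fail independently of each other."""
    result = Fraction(1)
    for p in daily_failure_probs:
        result *= p
    return result


single = Fraction(1, 100)               # one failure every 100 days
both = prob_all_fail([single, single])  # two redundant machines
print(both)                             # 1/10000
```

If the failures were dependent (for example, both machines sharing a harsh environment), the simple product would no longer apply and the joint probability could be much larger.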

e. Reliability Measures 191

In order to understand the reliability measures, we must first define the
“time-to-failure.” The time-to-failure of a system (or component, part, unit, or
element) is the time elapsing from when the system is put into operation until
the first failure. Let t = 0 be the time at which operation starts. The time to failure is subject to many
variables. Consequently, we can represent time-to-failure as a random variable T. We can
describe the condition or state of the system at time t by the condition random variable
X(t), where

\[
X(t) =
\begin{cases}
1 & \text{if the system is functioning at time } t \\
0 & \text{if the system is in a failed condition at time } t
\end{cases}
\]

The graphical representation of X(t) versus time t is shown in Figure 42.


Figure  42.  Condition  Variable  Versus  Time. (From  Hoyland,  page  18) 

The  time-to-failure  may  not  always  be  measured  in  time  but  can  also  be 
measured  in  numbers  of  repetitions  of  operation,  or  distance  of  operation,  or  number  of 
rotations  of  a  bearing,  etc.  We  can  assume  that  the  time-to-failure  T  is  continuously 
distributed  with  a  probability  density  f(t)  and  distribution  function  : 


191  The  material  from  this  section  is  taken  (in  some  places  verbatim)  from:  Hoyland,  pages  18-25. 
\[
F(t) = P(T \le t) = \int_0^t f(u)\,du \quad \text{for } t > 0.
\]


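The probability density f(t) is defined as:

\[
f(t) = \frac{d}{dt}F(t)
     = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t}
     = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t)}{\Delta t}
\]

If Δt is small, then:

\[
f(t) \cdot \Delta t \approx P(t < T \le t + \Delta t).
\]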

A typical distribution function F(t) and the corresponding density function
f(t) are shown in Figure 43.


Figure  43.  Distribution  and  Probability  Density  Functions  (From  Hoyland,  page  18) 

There  are  three  important  measures  of  reliability: 

(1)  The  reliability  or  survivor  function  R(t) 

(2)  The  failure  rate  z(t) 

(3)  The  mean  time  to  failure  (MTTF) 
(1). Reliability or Survivor Function R(t). The reliability function
of a system is defined as:

\[
R(t) = 1 - F(t) = P(T > t) \quad \text{for } t > 0 \tag{A}
\]

So  R(t)  is  the  probability  that  the  system  has  operated  without 
failure  in  the  time  interval  (0,t].  Equivalently  we  can  say  that  R(t)  is  the  probability  that 
the  unit  survives  in  the  time  interval  (0,t].  The  reliability  function  R(t)  is  also  called  the 
“survivor  function”.  A  typical  reliability  function  that  corresponds  to  the  distribution 
function  of  Figure  43  can  be  seen  in  Figure  44. 



Figure  44.  Typical  Distribution  and  Reliability  Function 


(2). Failure-rate or Hazard Function. The probability that a system
will fail in the time interval (t, t+Δt], given that it is in operating condition at time t, is

\[
P(t < T \le t + \Delta t \mid T > t)
  = \frac{P(t < T \le t + \Delta t)}{P(T > t)}
  = \frac{F(t + \Delta t) - F(t)}{R(t)}
\]

The failure rate z(t) is the limit, as Δt → 0, of the probability that a system will fail in the interval
(t, t+Δt], given that it is in operating condition at time t, per unit length of time. If this
unit length of time approaches 0, then we have the following expression for the failure
rate:

\[
z(t) = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t \mid T > t)}{\Delta t}
     = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} \cdot \frac{1}{R(t)}
\;\Rightarrow\;
z(t) = \frac{f(t)}{R(t)} \tag{B}
\]

because it is known that

\[
f(t) = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t}
\quad \text{or equivalently} \quad
f(t) = \frac{d}{dt}F(t). \tag{C}
\]


From the above it is implied that when Δt is small,

\[
P(t < T \le t + \Delta t \mid T > t) \approx z(t) \cdot \Delta t .
\]

So the conditional probability is approximately equal to
the failure rate z(t) at time t, times the length of the interval Δt.

From (A) and (C) we get:

\[
f(t) = \frac{d}{dt}\bigl(1 - R(t)\bigr) = -R'(t).
\]


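So (B) becomes:

\[
z(t) = -\frac{R'(t)}{R(t)} = -\frac{d}{dt}\ln R(t)
\;\Rightarrow\;
\text{since } R(0) = 1, \quad \int_0^t z(u)\,du = -\ln R(t),
\]

so

\[
R(t) = e^{-\int_0^t z(u)\,du}.
\]

Finally we have:

\[
f(t) = -R'(t) = -\frac{d}{dt}\left( e^{-\int_0^t z(u)\,du} \right)
\;\Rightarrow\;
f(t) = z(t)\, e^{-\int_0^t z(u)\,du}, \quad t > 0.
\]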

So  the  failure-rate  or  hazard  function  is  very  useful  for  modeling, 
because  everything  else  can  be  derived  from  that. 
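To illustrate how the other measures follow from the hazard function, the sketch below recovers R(t), F(t) and f(t) numerically from a given z(t). It is a minimal example under stated assumptions: the hazard z(t) = 2t is hypothetical (chosen because R(t) = exp(-t^2) is then known exactly), the time units are arbitrary, and the function names are illustrative.

```python
import math


def cumulative_hazard(z, t, steps=10_000):
    """Trapezoidal estimate of the integral of z(u) for u in [0, t]."""
    if t == 0:
        return 0.0
    h = t / steps
    total = 0.5 * (z(0.0) + z(t))
    for k in range(1, steps):
        total += z(k * h)
    return total * h


def reliability(z, t):
    """R(t) = exp(-integral from 0 to t of z(u) du)."""
    return math.exp(-cumulative_hazard(z, t))


def distribution(z, t):
    """F(t) = 1 - R(t)."""
    return 1.0 - reliability(z, t)


def density(z, t):
    """f(t) = z(t) * R(t)."""
    return z(t) * reliability(z, t)


def z_linear(t):
    """Assumed illustrative hazard that increases linearly with age."""
    return 2.0 * t


t = 1.5
print(reliability(z_linear, t), math.exp(-t ** 2))      # numeric vs. exact R(t)
print(density(z_linear, t), 2.0 * t * math.exp(-t ** 2))  # numeric vs. exact f(t)
```

Any non-negative hazard function can be substituted for `z_linear`; only the numerical integration accuracy changes.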

In the following table the relationships between the distribution
function F(t), the density function f(t), the reliability or survivor function R(t), and the
failure-rate or hazard function z(t) are presented. 192


192 Hoyland, page 22.
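| Expressed by | F(t) | f(t) | R(t) | z(t) |
|---|---|---|---|---|
| F(t) = | - | $\int_0^t f(u)\,du$ | $1 - R(t)$ | $1 - e^{-\int_0^t z(u)\,du}$ |
| f(t) = | $\frac{d}{dt}F(t)$ | - | $-\frac{d}{dt}R(t)$ | $z(t)\,e^{-\int_0^t z(u)\,du}$ |
| R(t) = | $1 - F(t)$ | $\int_t^\infty f(u)\,du$ | - | $e^{-\int_0^t z(u)\,du}$ |
| z(t) = | $\frac{dF(t)/dt}{1 - F(t)}$ | $\frac{f(t)}{\int_t^\infty f(u)\,du}$ | $-\frac{d}{dt}\ln R(t)$ | - |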

Table 32. Relationships Between Functions F(t), R(t), f(t), z(t) (From Hoyland, page 22)


For most mechanical and electronic systems the failure rate
over the life of the system has three discrete periods, characterized by the well-known
“Bathtub Curve,” shown in Figure 45. 193


Figure  45.  The  Bathtub  Curve 


193  RAC  Toolkit,  page  38. 
Infant mortality is the first phase of the bathtub curve, where the
failure rate is high because of early manufacturing tolerances and inadequate
manufacturing skills. The failure rate decreases through time as the design
and the manufacturing process mature. Useful life is the second phase, which is
characterized by a relatively constant failure rate. Wear-out is the last phase, where
components start to deteriorate to such a degree that they have reached the end of their
useful life. This can be modeled either piece-wise or as the sum of three failure-rate or
hazard functions, one for each phase. Then

\[
R(t) = e^{-\int_0^t z(u)\,du}
\]

and

\[
z(t) =
\begin{cases}
z_1(t), & t < a \\
z_2(t), & a \le t < b \\
z_3(t), & t \ge b
\end{cases}
\quad \text{or} \quad
z(t) = z_1(t) + z_2(t) + z_3(t).
\]

These concepts are illustrated in Figure 45.
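The "sum of three hazard functions" form can be sketched as follows. The three terms and all parameter values are hypothetical, chosen only to produce a bathtub shape; they are not fitted to any SUAV data, and other functional forms (for example Weibull terms) would serve equally well.

```python
import math

# Hypothetical parameters, arbitrary time units.
INFANT, DECAY = 0.05, 0.1  # z1(t) = INFANT * exp(-DECAY * t), decreasing
USEFUL = 0.002             # z2(t) = USEFUL, constant
WEAR = 1e-9                # z3(t) = WEAR * t**2, increasing


def hazard(t):
    """z(t) = z1(t) + z2(t) + z3(t)."""
    return INFANT * math.exp(-DECAY * t) + USEFUL + WEAR * t ** 2


def reliability(t):
    """R(t) = exp(-integral of z), using the closed-form integral of each term."""
    cumulative = ((INFANT / DECAY) * (1.0 - math.exp(-DECAY * t))
                  + USEFUL * t
                  + WEAR * t ** 3 / 3.0)
    return math.exp(-cumulative)


for t in (0, 10, 100, 1000, 3000):
    print(t, round(hazard(t), 5), round(reliability(t), 5))
```

With these values the hazard starts high, settles near the constant term during the middle of life, and rises again as the wear-out term grows.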

(3). Mean-Time-to-Failure (MTTF). The MTTF of a system is the
expected value of T, which is determined by the density function f(t) and is defined as:

\[
MTTF = E(T) = \int_0^\infty t\, f(t)\,dt . \tag{D}
\]


If  the  time  needed  to  repair  or  replace  a  failed  system  is  very  short 
relative  to  MTTF,  then  the  mean  time  between  failures  (MTBF)  is  represented  by  MTTF. 
If  the  repair  time  is  comparable  to  MTTF,  then  the  MTBF  also  includes  the  mean  time  to 
repair  (MTTR).  These  concepts  are  illustrated  in  Figure  46. 
Figure 46. MTTF, MTTR, MTBF. (From Hoyland, page 25)


Because f(t) = -R'(t), (D) becomes:

\[
MTTF = -\int_0^\infty t\, R'(t)\,dt
     = -\int_0^\infty t\, \frac{dR(t)}{dt}\,dt
     = -\bigl[\, t R(t) \,\bigr]_0^\infty + \int_0^\infty R(t)\,dt ,
\]

by integration by parts, and if MTTF < ∞, which is what happens in reality, then
$-\bigl[\, t R(t) \,\bigr]_0^\infty = 0$, so

\[
MTTF = \int_0^\infty R(t)\,dt \tag{E}
\]

also.
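The equivalence of (D) and (E) can be checked numerically. The sketch below assumes a hypothetical Weibull lifetime (shape 2, scale 100, arbitrary time units), for which the exact MTTF is scale × Γ(1 + 1/shape). The integration helper and the finite upper limit are illustrative choices, not part of the cited derivation.

```python
import math

ETA = 100.0  # assumed Weibull scale parameter
K = 2.0      # assumed Weibull shape parameter


def R(t):
    """Reliability function of the assumed Weibull lifetime."""
    return math.exp(-((t / ETA) ** K))


def f(t):
    """Density f(t) = -R'(t) for the same lifetime."""
    return (K / ETA) * (t / ETA) ** (K - 1) * R(t)


def integrate(g, a, b, steps=100_000):
    """Trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for k in range(1, steps):
        total += g(a + k * h)
    return total * h


UPPER = 10 * ETA  # R(t) is negligible beyond this point
mttf_D = integrate(lambda t: t * f(t), 0.0, UPPER)  # equation (D)
mttf_E = integrate(R, 0.0, UPPER)                   # equation (E)
exact = ETA * math.gamma(1.0 + 1.0 / K)
print(mttf_D, mttf_E, exact)
```

Form (E) is often the more convenient of the two in practice, because it needs only the reliability function.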

f. Structure Functions

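The system and each component may only be in one of two states,
operable or failed. Let $x_i$ indicate the state of component i, for $1 \le i \le n$, and

\[
x_i =
\begin{cases}
1 & \text{if component } i \text{ works} \\
0 & \text{if component } i \text{ has failed}
\end{cases}
\]

$\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is the component state vector.

The state of the system is also a binary random variable, which is
determined by the states of its components.

\[
\phi = \phi(\mathbf{x}) = \text{system state} =
\begin{cases}
1 & \text{if the system works} \\
0 & \text{if the system has failed}
\end{cases}
\]

$\phi = \phi(\mathbf{x})$ is the structure function of the system. 194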


194  Kuo,  W.,  and  Zuo,  J.  M.,  Optimal  Reliability  Modeling,  John  Wiley  &  Sons,  2003,  page  87. 
A series system with n components works if and only if each of its n
components works, and fails whenever any of its components fails. The structure function
for a series system is

\[
\phi = \phi(\mathbf{x}) = x_1 \cdot x_2 \cdots x_n = \prod_{i=1}^{n} x_i .
\]
195

It cannot usually be predicted with certainty whether or not a given
component will be in a failed state after t time units. So we interpret the state variables of
the n components at time t as random variables, and we denote them
as $X_1(t), X_2(t), \ldots, X_n(t)$.

Now we focus on the following probabilities:

$P(X_i(t) = 1) = p_i(t)$ for $i = 1, 2, \ldots, n$, which is the reliability of component i,
and $P(\phi(\mathbf{X}(t)) = 1) = p_S(t)$, which is the system’s reliability.

For the state variables $X_i(t)$, for $i = 1, 2, \ldots, n$, we have

\[
E[X_i(t)] = 0 \cdot P(X_i(t) = 0) + 1 \cdot P(X_i(t) = 1) = p_i(t).
\]

For the system reliability at time t, we have:

\[
p_S(t) = E[\phi(\mathbf{X}(t))], \quad \text{where } \mathbf{X}(t) = (X_1(t), X_2(t), \ldots, X_n(t)).
\]

Assuming that $X_1(t), X_2(t), \ldots, X_n(t)$ are independent, the system
reliability is

\[
p_S(t) = \prod_{i=1}^{n} p_i(t)
\quad \text{or} \quad
R(t) = r_1(t) \cdot r_2(t) \cdots r_n(t) = \prod_{i=1}^{n} r_i(t),
\]

where R(t) is the system’s reliability and $r_i(t)$ is the reliability of component i for a series system. 196
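The series structure function and the product rule for system reliability can be sketched in a few lines of Python. The component reliabilities below are hypothetical values at some fixed time t, and the function names are illustrative. The brute-force expectation over all state vectors is included only to show that E[φ(X)] matches the product when components are independent.

```python
from itertools import product
from math import prod


def series_structure(x):
    """phi(x) = x_1 * x_2 * ... * x_n: 1 only if every component works."""
    return prod(x)


def series_reliability(p):
    """System reliability of a series system with independent components."""
    return prod(p)


def expected_structure(p):
    """E[phi(X)] by enumerating all 2^n state vectors, assuming independence."""
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        weight = prod(pi if xi else 1.0 - pi for xi, pi in zip(x, p))
        total += series_structure(x) * weight
    return total


p = [0.99, 0.95, 0.90]  # hypothetical component reliabilities at time t
print(series_structure((1, 1, 0)))  # one failed component fails the system
print(series_reliability(p))
print(expected_structure(p))
```

The enumeration grows as 2^n, so it is useful only as a check on small systems; the product form is what one would use in practice.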


195  Hoyland,  page  99. 

196  Ibid,  page  127-129. 
g. Series System Reliability Function and MTTF 197
From Table 32 we find that the failure rate function for the system is

\[
z(t) = -\frac{d}{dt}\ln R(t)
     = -\frac{d}{dt}\ln\bigl( r_1(t) \cdot r_2(t) \cdots r_n(t) \bigr)
     = -\sum_{i=1}^{n} \frac{d}{dt}\ln r_i(t),
\]

which is $z(t) = z_1(t) + z_2(t) + \cdots + z_n(t)$.

So the failure rate for a series system equals the sum of the failure rates of
all its components. As a result, the failure rate of the system is at least as large as the failure rate
of any single component, and the whole system is driven by the worst component, which
is the one with the largest failure rate and therefore the lowest reliability.

From  the  above,  we  can  conclude  that  if  we  want  to  optimize  a  series 
system  reliability,  we  must  reduce  the  number  of  the  components,  and  if  that  is  not 
possible,  then  we  must  enhance  the  reliability  for  the  worst  component. 

For example, and to simplify, we may assume that each of the components
in our system has an exponential lifetime distribution. Then the system also has an
exponential lifetime distribution. If $z_i(t) = \lambda_i$ is the failure rate for component i, then the
failure rate for the system is

\[
z(t) = \lambda_S = \sum_{i=1}^{n} \lambda_i ,
\]

and the reliability function of the system becomes $R(t) = e^{-\lambda_S t}$. Then (E) becomes:

\[
MTTF_S = \int_0^\infty e^{-\lambda_S t}\,dt = \frac{1}{\lambda_S}.
\]
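As a worked illustration with hypothetical numbers (not taken from any SUAV data), consider a series system of three components with constant failure rates of 0.001, 0.002 and 0.0005 failures per hour:

\[
\lambda_S = 0.001 + 0.002 + 0.0005 = 0.0035 \text{ per hour},
\qquad
MTTF_S = \frac{1}{0.0035} \approx 285.7 \text{ hours},
\qquad
R(100) = e^{-0.0035 \cdot 100} = e^{-0.35} \approx 0.705 .
\]

The least reliable component alone has an MTTF of 1/0.002 = 500 hours, so the system MTTF is shorter than that of its worst component, as the sum of failure rates implies.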

h.  Quantitative  Measures  of  Availability 

The quantitative measures of availability are listed in the following
table. 198


197  Kuo,  pages  107-108. 

198  RAC  Toolkit,  page  12. 
| Measure | Equation | Reliability & Maintainability considerations |
|---|---|---|
| Inherent Availability | $A_i = \frac{MTBF}{MTBF + MTTR}$ | Assures operation under declared conditions in an ideal customer service environment. It is usually not a field-measured requirement. |
| Achieved Availability | $A_a = \frac{MTBM}{MTBM + MTTR_{active}}$ | Similar to $A_i$. |
| Operational Availability | $A_o = \frac{MTBM}{MTBM + MDT}$ | Extends $A_i$ to include delays. Reflects the real world operating environment. Not specified as a manufacturer-controllable requirement. |

MTBF = Mean Time between Failure
MTTR = Mean Time to Repair
MTBM = Mean Time between Maintenance
MTTRactive = Mean Time to Repair
MDT = Mean Downtime (corrective maintenance only)

Table  33.  The  Quantitative  Measures  of  Availability  (After  RAC  Toolkit,  page  12) 
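The three availability measures in Table 33 reduce to simple ratios. The sketch below evaluates them for hypothetical input values in hours; the numbers and function names are illustrative assumptions. The second argument of `achieved_availability` stands for the active repair time that the table legend calls MTTRactive.

```python
def inherent_availability(mtbf, mttr):
    """A_i = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


def achieved_availability(mtbm, mean_active_repair_time):
    """A_a = MTBM / (MTBM + active repair time)."""
    return mtbm / (mtbm + mean_active_repair_time)


def operational_availability(mtbm, mdt):
    """A_o = MTBM / (MTBM + MDT), where MDT includes delays."""
    return mtbm / (mtbm + mdt)


# Hypothetical values in hours, for illustration only.
a_i = inherent_availability(mtbf=50.0, mttr=2.0)
a_a = achieved_availability(mtbm=40.0, mean_active_repair_time=2.5)
a_o = operational_availability(mtbm=40.0, mdt=10.0)
print(round(a_i, 4), round(a_a, 4), round(a_o, 4))
```

Because mean downtime includes logistic and administrative delays, operational availability is typically the lowest of the three for the same system.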
APPENDIX E: LIST OF ACRONYMS AND DEFINITIONS


AAV  -  Advanced  Air  Vehicle 

ACTD  -  Advance  Concept  Technology  Demonstrations 

AEA  -  American  Engineering  Association 

APU  -  Auxiliary  Power  Units 

ATA  -  Air  Transport  Association 

BDA  -  Battle  Damage  Assessment 

BS  -  British  Standard 

CAA/UK - Civil Aviation Administration from the UK
CHAE  -  Conventional  High-Altitude  Endurance 
CP  -  Counter-proliferation 

DARPA  -  Defence  Advanced  Research  Projects  Agency 

DS  -  Discard 

EO  -  Electro-Optical 

EPRI  -  Electric  Power  Research  Institute 

ERAST  -  Environmental  Research  Aircraft  and  Sensor  Technology 

FAA - Federal Aviation Authority

FMA - Failure Mode Analysis

FMCA - Failure Mode and Critical Analysis

FMEA - Failure Mode and Effect Analysis

FMECA - Failure Mode Effect and Criticality Analysis

FRACAS - Failure Reporting And Corrective Action System

FTA - Fault Tree Analysis

GCS  -  Ground  Control  Station 

GPS  -  Global  Positioning  System 

HAE - High-Altitude Endurance

ICAO  -  International  Civil  Aviation  Organization 

IN/FC - Inspection/Functional Check

INS  -  Inertial  Navigation  System 

IR  -  Infrared 

ISC  -  Industry  Steering  Committee 
ISO  -  International  Standard  Organization 
JSF - Joint Strike Fighter

L/HIRF - Lightning/High Intensity Radiated Field

LOS - Line-Of-Sight

LU/SV - Lubrication/Servicing

MAV - Micro-Air Vehicle

MDT  -  Mean  Downtime 

MR  -  Mishap  Rate 

MRB - Maintenance Review Board

MRBR  -  Maintenance  Review  Board  Report 

MSG-3  -  Maintenance  Steering  Group-3 

MSI  -  Maintenance  Significant  Items 

MTBCF - Mean Time Between Critical Failure

MTBF - Mean Time Between Failure

MTBM - Mean Time between Maintenance

MTTR - Mean Time to Repair
MUAV  -  Micro  UAV 

NASA  -  National  Aeronautics  and  Space  Administration 

NAWC/AD  -  Naval  Air  Warfare  Centre  Aircraft  Division 

NPS - Naval Postgraduate School

NRL  -  Naval  Research  Laboratory 

O&S  -  Operation  and  Support 

OBC  -  Onboard  Computer 

OP/VC - Operational/Visual Check

OTHT  -  Over  The  Horizon  Targeting 

PM  -  Planned  Maintenance 

QFD  -  Quality  Function  Deployment 

RC  -  Radio  Control 

RCM  -  Reliability  Centered  Maintenance 
RECCE  -  Reconnaissance  mission 
RPN  -  Risk  Priority  Number 
RPV  -  Remote  Piloted  Vehicles 
RS  -  Restoration 

RSTA  -  Reconnaissance  Surveillance  and  Target  Acquisition 

SAE  -  Society  of  the  Automotive  Engineers 

SAR  -  Synthetic  Aperture  Radar 

SARP  -  Standards  and  Recommended  Practices 

SEAD  -  Suppression  of  the  Enemy  Air  Defences 
SIGINT - Signal Intelligence

SSI  -  Structural  Significant  Items 

STAN  -  Surveillance  and  Tactical  Acquisition  Network 

SUAV  -  Small  Unmanned  Aerial  Vehicle 

TAAF  -  Test,  Analyze  and  Fix 

TR  -  Tactical  Reconnaissance 

TUAV  -  Tactical  UAV 

UCAV  -  Unmanned  Combat  Aerial  Vehicle 

UHF  -  Ultra  High  Frequency 

VR  -  Vendor  Recommendations 

VTOL  -  Vertical  Take-Off  and  Landing 

WG  -  Working  Group 